
Version 3.x Design Ideas

This page contains version 3.x backend data design notes. See the UI Design for UI related design information. The Design Goals page and Schedule page contain other related development information.

Topics

  • Topics are an umbrella for health indicators (IP and QM) and, in most cases, provide a starting point/main driver for novice users. Each major health topic area should have a clean welcome type page aimed at new users. This welcome page would include an inviting slider that draws new users in so they can more easily identify with and understand their options. Graphics, embedded videos, and clean related links and resources should be included. A link should also be provided to the topic's detailed information page which contains links to associated IP reports, query datasets, and interesting community reports, publications and other related content.
  • Topics will be handled with HTML_CONTENT XML for maximum flexibility. It would be wise to define the general categories for all adopters so that, if this ever needs to be put into a db or other structured storage mechanism, it easily can be.
  • Could have set of IP datasets for each type of topic then have a community comparison report option for any given topic - like the England PHO Framework.
  • IPs can be associated with/linked back to 0:n Topics.
  • QMs do NOT have a direct association/link back to Topics as of July, 2016.

High Level Indicator, View & Dataset Definition

  • Indicator profiles are a mix of contextual information, metadata, and numerical values.
  • Indicator Views are special indicator reports that use the main indicator contextual data, add more specific “view” contextual data, and use one or more IP datasets.
  • An indicator, its views, and its datasets are 100% owned by only one organization.
  • An indicator can have multiple numerical datasets.
  • An indicator dataset is based on the same data sources and measure.
  • An indicator view can reference/use/consume any of its indicator's datasets that have the same period and measure.
  • An indicator view must have at least one category with optional series and constants.
  • An indicator view must use its period either as a series, category, or constant.
  • An indicator view may at some point reference/use/consume a different indicator's datasets data (however, see future note below).
  • The biggest change is “View Values” morphing into one or more datasets. In the past IPs had one or more IPVs which had one or more IPV VALUES. The IPVs were very specific datasets that in some cases contained mixed data sourced values because different data were being compared (think county, state and US – each being a series). Each view will likely require multiple dataset definitions (datasets are like the QM results). Also, datasets are required to have a period dimension so that all values are trendable.
  • An indicator will have an optional NOT_SELECTABLE_FLAG that hides that IP from selection/assoc index pages. Such indicators are considered data support IPs, which provide data for other uses like community reports etc. At one point this flag also existed for the IP view. However, this is not needed as the view is simply an indicator report and is not consumed by anything else. The dataset does have consumable data but no UI navigation indexes/links will refer to them so it is not needed.
  • IPV.Y_TITLE goes away as the dataset MEASURE TITLE will be used.
  • IPV charts/graphs/maps/data tables will by default have code that builds a “synopsis” heading (the chart/map title). These code built synopses will be based on a set of defined rules that can include fields like indicator title, view title, period dimension, other dimensions etc. The IPV.SUB_TITLE and IPV.PERIOD_TITLE go away as the period dimension is used etc.
  • Views will have a SYNOPSIS override to support 100% editor specified custom chart/map titles/headings.
  • View creators will need to be smart in how they choose which dimensions and dimension values are to be used for display. Just because a dataset is set up with a given set of dimension values does not mean that every combination exists in the actual entered dataset value records.
  • Views can have independent maps and charts. This is because a chart can be 1 or 2 dimensional with either 1 or 2 of the 3 dimensions being held constant. Maps are one dimensional and MUST hold the second/third non geo dimension values constant.
  • Two types of views:
    1. Surrogate view. This consists of a URL which allows a non standard IPV page to be included as part of the normal indicator report navigation. This includes the ability to use IBIS-PH HTML_CONTENT pages or link to outside pages. These pages will not have any special processing and will simply make use of the HTML IFRAME element. Note that these special pages must be hand coded and maintained - graphs, data tables, contextual fields – all of it. This option provides total flexibility to present a view.
    2. The standard IPV. Uses XSLT code to pull the data from XML files and builds a standardized “view” page.
  • At some future point, end users will be allowed to build their own IP report. They can do data discovery and combine like value types etc. The self registered user save query functionality could be extended to support saving and sharing these definitions as well as creating system type definitions.
  • At some future point, editors could combine different measures because they know what they're doing and what they're trying to convey. The public will only be able to build their own report that compares like measures of the same indicator.
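
The view rules above (at least one category, period used as a series, category, or constant, and datasets sharing the view's period and measure) could be checked with a small routine like the one below. This is only a sketch under stated assumptions: the `Dataset` and `View` classes, their field names, and `validate_view` are hypothetical and are not part of the IBIS-PH code; only the rules themselves come from these notes.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Dataset:
    # Hypothetical, simplified dataset definition.
    name: str
    indicator_name: str
    measure: str           # e.g. "Rate per 100,000"
    period_dimension: str  # name of the mandatory period dimension


@dataclass
class View:
    # Hypothetical, simplified view definition.
    indicator_name: str
    measure: str
    period_dimension: str
    categories: list[str]
    series: list[str] = field(default_factory=list)
    constants: list[str] = field(default_factory=list)


def validate_view(view: View, datasets: list[Dataset]) -> list[str]:
    """Return the list of rule violations (empty means the view passes)."""
    problems = []
    if not view.categories:
        problems.append("view needs at least one category")
    roles = view.categories + view.series + view.constants
    if view.period_dimension not in roles:
        problems.append("period must be used as a series, category, or constant")
    for ds in datasets:
        if ds.indicator_name != view.indicator_name:
            problems.append(f"{ds.name}: belongs to a different indicator")
        if ds.measure != view.measure:
            problems.append(f"{ds.name}: measure differs from the view")
        if ds.period_dimension != view.period_dimension:
            problems.append(f"{ds.name}: period differs from the view")
    return problems
```

Returning a list of problems rather than raising on the first one lets an editor UI show every issue at once. The "different indicator" check would need to be relaxed if cross-indicator dataset use is ever allowed.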

Indicator Datasets

  • Indicator datasets are similar to the QMs datasets which have their own specific data source(s). Datasets should NOT be general and all inclusive. They should be discrete chunks of data from the specified datasource (and in some cases datasources).
  • Datasets will have some metadata like data sources but are mainly the numerical data values. They are 2-d or 3-d cubes organized by mandatory period dimension and 1 or 2 other group by dimensions (used in a “view” as category and series).
  • Indicator dataset numerical measure values are ALL of the same “value type”.
  • Datasets are data source based - just like IBISQ datasets.
  • Datasets can have one or more data sources.
  • Datasets are tied to an indicator (INDICATOR_NAME) so that meta data is easily found for CP reports.
  • Datasets have a mandatory period dimension. The period can be a single "all years combined" value, a set of year ranges, or a series of year values. This will allow for future trending and takes care of having to specify the period title.
  • Will have at most 2 other dimensions - one of which should be a community type dim - max of 3 total: period, community, other.
  • Dataset value records will use a similar query result XML structure - RECORDS, MEASURE, DIMENSIONS, ANCILLARY_VALUES, VALUE_ATTRIBUTE.
  • NUMER and DENOM will still be at the RECORD level and not the dataset level as per Lois 11/2015.
  • Value type is accessed via the MEASURE.
  • Numeric fields (measure, LCL, UCL, numerator, denominator) are required to be 100% numeric - no special characters which require kludged code and processing rules. For special “insufficient data” type values, one or more “value attribute”s can be assigned to any given data record.
  • Datasets can be defined and populated with a variety of different dimensions. This is valid as long as they all have the same measure and datasource. The main challenge is not a technical issue but a content issue e.g. providing a complete data cube with all combinations for data consumption.
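
The record level rules above (mandatory period dimension, at most three dimensions in total, strictly numeric value fields with special cases carried as value attributes) can be sketched as a simple check. The `Record` class, its field names, and `validate_record` are hypothetical illustrations, not the actual XML record structure. Treating a missing (`None`) numeric field as acceptable is also an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from numbers import Real
from typing import Optional

NUMERIC_FIELDS = ("measure", "lcl", "ucl", "numerator", "denominator")


@dataclass
class Record:
    # Hypothetical in-memory form of one dataset value record.
    dimensions: dict                     # dimension name -> dimension value
    measure: Optional[float] = None
    lcl: Optional[float] = None
    ucl: Optional[float] = None
    numerator: Optional[float] = None
    denominator: Optional[float] = None
    value_attributes: list = field(default_factory=list)  # e.g. ["INSUFFICIENT_DATA"]


def validate_record(record: Record, period_dimension: str) -> list[str]:
    """Return the list of rule violations for one record."""
    problems = []
    if period_dimension not in record.dimensions:
        problems.append("missing period dimension")
    if len(record.dimensions) > 3:
        problems.append("more than 3 dimensions")
    for name in NUMERIC_FIELDS:
        value = getattr(record, name)
        # Strings such as "**" or "<5" are rejected; a value attribute
        # should be used to flag special cases instead.
        if value is not None and (isinstance(value, bool) or not isinstance(value, Real)):
            problems.append(f"{name} is not numeric: {value!r}")
    return problems
```

A suppressed value would then be stored with an empty measure plus a value attribute, rather than a marker string in the measure field.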

Future Dataset

  • Indicator profile reports *will*, at some future point, support consuming/referencing other indicator datasets (as long as the “value type” is the same). For example, the ecig survey data can be used for ecig, smoking, and chew.
  • Datasets are meant to be 100% IP based, but there are some that are not. Also datasets will need to reference the datasource set they are based on. The name will have to be like "IP name.sub data source name(s).xml".
  • Datasets *MAY* be broken down into small XML data files where each file is a subset with same dimensions and/or same measure(s).
  • There are use cases where it might be desirable to have a dataset be based on an IBISQ dataset. A request controller would have to intercept the XML data request, dynamically run the IBISQ saved query, and then do whatever further model map processing is needed (e.g. merging with a static supplemental IP dataset or another saved query etc). This is not wanted behavior for IP and CP reports because values must be more tightly screened and the IBISQ regression testing of a dataset is too much work for adopters. A dynamic solution also would be impacted by hardware performance and, as stated, would require that each dataset be inspected by Java code to determine if it needs to run the query. For IP datasets a saved query URL definition provides a mechanism to update the static HI datasets via a saved query with IP datasets following the traditional approval and publish cycles.

Data Sources

  • Data sources will contain the core of the "data notes" and "data issues". The IPV can contain specific, supplemental "data notes" and "data issues" that are related to the “view/report”.
  • IBISQ query modules should be updated to reference or include the new data sources with the specific supplemental notes being with the QM CONFIGURATION - like they are now.
  • At some point datasets and IBISQ QMs can either embed the DS information or reference it. V3.0 will likely embed the DS into the IP datasets.

Dimensions

  • There are three types of dimensions: 1) period, 2) community, and 3) any other which is not period related or considered a community. Dimensions are used for discovery and comparisons.
  • Period dimensions provide the basis for trending and are either a set of discrete years or a set of year ranges. Adopters may create a blank period dimension for those special cases where period does not apply but it is highly recommended to specify a period if at all possible.
  • Period dimension will have a special "PERIOD_FLAG".
  • Dimensions can set a COMMUNITY_FLAG which simply makes it easier to determine which dims can be selected and presented as a community.
  • Typical Community Dimensions include:
    • Sex
    • Race
    • Ethnicity
    • Geo Area (county, region, etc)
    • Age (age group)
    • Education
    • Income
    • Marital status
    • Employment
    • Disability
    • Natality
    • Veteran

Measures

  • Measures are the dataset records' specific value type and are best conceptualized as a chart's "Y Title".
  • Measures are a specific superset of version 2.3's "Value Type". They have a name, a title (which will be used for the IPV Y title), and a format pattern.
  • Measures will be defined centrally and will be picked at the time of dataset definition.
  • Measures are defined by Goldilocks; they have to be just right for what they're being used for. As generic as possible but specific enough to not allow comparison misuse of the data. The usage context must be considered (e.g. the indicator title, the possible series titles, and comparisons). Given that the y title is within the context of the health indicator it can make sense to use a more generic “value type” "Rate per 100,000". However, if an indicator view/report contains different types of “Rate per 100,000” measures then a more specific measure(s) should be used.
  • Unlike the IBIS-Q dataset, there can NOT be multiple measures per record as of July 2016. This was discussed with Lois and there is not an immediate need for now. See the “Peer Group” and “Rank” sections for multiple measure needs.
  • Some examples of measures are:
    • Count
    • Age Adjusted Rate per 100,000
    • Rate per 100,000
    • Rate per 1,000
    • Discharges per 100,000 (may or may not be needed to be this specific - depends on context and related datasets)
    • Percentage of Live Born Infants
    • Age Adjusted %
    • % Live Birth Infants
    • Currency
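
As a rough illustration of centrally defined measures with a name, title, and format pattern, the sketch below keeps a small registry and formats values through it. The measure names, the `MEASURES` registry, and `format_value` are hypothetical. The format patterns are assumed here to be Python format specs, which may differ from the pattern syntax the application actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measure:
    name: str            # key picked at dataset definition time
    title: str           # used as the chart's Y title
    format_pattern: str  # assumption: a Python format spec


# Hypothetical central registry of measures.
MEASURES = {
    m.name: m
    for m in (
        Measure("COUNT", "Count", ",.0f"),
        Measure("RATE_100K", "Rate per 100,000", ",.1f"),
        Measure("AGE_ADJ_PERCENT", "Age Adjusted %", ".1f"),
    )
}


def format_value(measure_name: str, value: float) -> str:
    """Format a numeric value using its measure's format pattern."""
    measure = MEASURES[measure_name]
    return format(value, measure.format_pattern)
```

Because every dataset points at a measure by name, two datasets can be treated as comparable only when those names match, which is the misuse guard the notes describe.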

Peer Groups

  • Currently thought to use the ANCILLARY VALUE mechanism within a demographic type indicator.
  • Peer groups allow a user to find and compare various indicators for a given community/group.
  • Peer groupings associate a general indicator range rank value to a dimension (that has the COMMUNITY_FLAG set). This allows a user to see any and all dimensions (dimension values) that have the same “peer group value” for a given indicator(s). This provides the foundation for doing data discovery “what ifs” e.g. I want to see any dimension values (communities) that have the same IP peer group rankings (income, birth rate, and education level) as my dimension of interest (Grand County). Once that set of peers is defined a user could choose a set of indicators to report on to build their own report. That definition could then be saved and shared much like a user can do with a query definition.
  • Here are a few basic demographic examples (with the associated dimensions):
  1. PERSONAL INCOME: age, geo, race, eth, edu
  2. HOUSEHOLD INCOME: age, geo, race, eth, edu
  3. POPULATION DENSITY: geo
  4. POPULATION SIZE: age, geo, race, eth, edu
  5. EDUCATION (grad rates, birth mom, proficiency, lunch programs): age, geo, race, eth
  6. FAMILY SIZE: geo, race, eth
  7. EMPLOYMENT: geo, race, eth
  8. CHILD CARE
  9. HIGH SCHOOL GRADUATION RATE

And examples of non demographic health indicators (community profile type reports):

  1. Peer breast cancer counties.
  2. Peer diabetes groups (like county, LHD, sex, age group).
  • Peer group limitations. Editors will need to be aware of the range of peer group values and their global usage.
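
The peer "what if" lookup described above could look roughly like this. It is a sketch only: the `find_peers` function and the nested dictionary layout are hypothetical, and the peer group values in the usage note are made up for illustration rather than taken from real data.

```python
def find_peers(peer_values, community, indicators):
    """Return communities sharing the given community's peer group value
    for every listed indicator.

    peer_values: {indicator name: {community name: peer group value}}
    """
    if not indicators:
        return []
    wanted = {ind: peer_values[ind][community] for ind in indicators}
    # Only communities that have a value for every chosen indicator qualify.
    candidates = set.intersection(*(set(peer_values[ind]) for ind in indicators))
    return sorted(
        c
        for c in candidates
        if c != community
        and all(peer_values[ind][c] == wanted[ind] for ind in indicators)
    )
```

For example, with `{"INCOME": {"Grand": 2, "San Juan": 2, "Emery": 3}, "EDUCATION": {"Grand": 1, "San Juan": 1, "Emery": 1}}`, asking for the peers of `"Grand"` on both indicators yields only `"San Juan"`, while asking on `"EDUCATION"` alone yields both other counties.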

Rankings

  • Currently thought to use the ANCILLARY VALUE mechanism.
  • Rankings are like peer groups but can be different based on the usage context (think state or US high/low/same).
  • It was first thought to put a simple “RANK” field in the dataset values structure. This would allow adopters to establish basic ranking values. It would also provide a second type of specialized measure that can be used for either peer groupings or HLS ratings etc. without having to implement a truly dynamic multi measure/multi peer group solution. The dataset definition would have included a ranking definition text field to provide the actual usage definition.
  • The main issue using RANK is that it is complicated when considering usages among the different datasets. At this point the main use case is for HLS choropleth maps. The real solution to this is to implement code that correctly “ranks” the data point based on some base value and the record's conf limits. While a single RANK is a good start and better than nothing, it would also need to cover more than just demographic peer groupings. A given IP dataset value may need peer grouping and a state high/low/same ranking. The simple solution does not allow the IP editor to have both (or more peer groups, rankings) – they must choose one over all the others. Put another way, this solution does not allow for peer groups within a dimension or specific context – it's global. For general demographic type rankings it is probably not an issue but if the ranking needs to be within the context of the state, and/or the US, and/or within the dimension set, and/or high/low/same use cases then this simple one generic RANK value field does not work.
  • Future use cases would more easily allow “what if” types of data discovery. I want to see all the peers to my county that have the same income, family size, and employment for a given year. Then add in some comparisons like showing the US, STATE, and other selected communities.
  • * For now, put on hold. If adopters use/need this (and funding is available) then a robust solution can be implemented.
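
One possible shape for the code based high/low/same (HLS) ranking mentioned above is shown below. The notes do not specify the rule, so this sketch assumes a common convention: a value is "high" or "low" only when its whole confidence interval lies above or below the base value. The function name and return labels are hypothetical.

```python
def hls_rank(lcl: float, ucl: float, base_value: float) -> str:
    """Rank a data point against a base value (e.g. the state rate)
    using the record's lower and upper confidence limits."""
    if lcl > base_value:
        return "HIGH"  # entire interval is above the base value
    if ucl < base_value:
        return "LOW"   # entire interval is below the base value
    return "SAME"      # interval contains the base value
```

Because the base value is passed in, the same record can be ranked against the state, the US, or any other comparison value without storing a separate RANK for each context.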

Demographics / Dimension Metadata

  • Demographics are indicator/dataset based - just like any other indicator profile. This allows demographic data to be graphed, trended, and have contextual/metadata text.
  • Demographic indicators will set a DEMOGRAPHIC_FLAG to make it easy to identify those types of datasets.
  • The dataset's ancillary "RANK" value field can be used to identify peer communities. Implementing a general RANK field has an issue of what is it – High/low/same, peer group ranking, relative to state?, relative to US? relative to a specific group? * Maybe need to have the dataset definition include a RANK_DEFINITION field? What about those instances where you want both – HLS RANK and PEER GROUP? This is a future enhancement that can be added later if funding allows.
  • At some point could implement a user defined search for all demographic indicators that are peers with x,y, and z. Then a community profile report could be built for a set of indicators for that group of peer demographic areas.
  • Originally it was considered to put demographic metadata and rankings in with dimension values which provides one stop shopping - when viewing a given community all the demo data is available by simply selecting that dimension value. Downside is that there is not any trending and is limited to a predefined set of values. This also doesn't help when considering non demographic dimensions?
  • The kendo grid can be used for semi sophisticated drill downs. For example say you want to limit a community to a certain subset of demo rankings you could "filter" the grid for any values and any combination of columns (provided the data is in the datasource). The leaflet map code needs to be updated to be datasource based for this to work.
  • See emails and design docs on this but the basics are:

Objectives and Targets

  • HPO will be replaced with a general purpose INITIATIVE, INITIATIVE_TOPIC, and INITIATIVE_TOPIC_OBJECTIVE. These tables will provide the contextual fields for HPO and state type health objective definitions. There are optional URL and NARRATIVE fields at each level as well as a general TARGET textual field in the objective structure.
  • There will be UI and tables that allow indicator editors to associate any INITIATIVE_TOPIC_OBJECTIVE to an indicator and/or view. This provides a mechanism to have multiple target values.
  • Both indicator and views will have an optional OTHER_OBJECTIVE field that can be used along with the initiative objective structures. Objectives will have the following order:
  1. Indicator related objectives based on their sort order
  2. Indicator View related objectives based on their sort order
  3. Indicator OTHER_OBJECTIVE text
  4. Indicator view OTHER_OBJECTIVE text
  • Numerical target values to be displayed on charts or within a record's values are not part of the objective structure. This functionality was determined by Kim and Lois to be implemented as a series in the appropriate dataset.
  • Numerical target value datasets will have an OBJECTIVE_TARGET_FLAG that is used by the UI to apply a different set of visual properties to the chart.
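
The objective display order listed above can be expressed as a short routine. This is a sketch with hypothetical names: objectives are represented here as `(sort_order, text)` pairs, and empty OTHER_OBJECTIVE text is simply skipped, which is an assumption.

```python
def ordered_objectives(indicator_objectives, view_objectives,
                       indicator_other=None, view_other=None):
    """Return objective texts in display order.

    indicator_objectives / view_objectives: lists of (sort_order, text).
    indicator_other / view_other: optional OTHER_OBJECTIVE text.
    """
    ordered = [text for _, text in sorted(indicator_objectives, key=lambda o: o[0])]
    ordered += [text for _, text in sorted(view_objectives, key=lambda o: o[0])]
    # OTHER_OBJECTIVE text comes last: indicator first, then view.
    ordered += [text for text in (indicator_other, view_other) if text]
    return ordered
```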

Community Profile

  • Community Profile Snapshot and Highlight reports are IP and IP dataset based. They both rely heavily on IP context fields and may use IPV context fields if needed.
  • New IP target values and other related community values can be used for the comparison at some future point. Could have the user specify. Could have user build their own report and save definition and share - like saved query.
  • The index pages will likely remain as is – possibly refactored to clean up.

HL Quick Facts Page

  • Will continue to refactor page.
  • Demographic data will come from referencing the specified IP datasets by community dimension and comparison dataset dimension values.
  • Chart will be refactored to utilize the Kendo charting package. Will need to specify set of indicators, which IP dataset dimensions to use for comparisons, and the comparison period value.
  • Community thumbnail will be a simple image URL. This is because dynamically generating a map for geo communities would require a lot of resources and would be slow. Also for SA geos this would be useless. If implemented as a graphic URL then could have a non map/appropriate image for Race, Ethn, Sex, and other non geo communities.
  • Community contact info and other metadata will be refactored as time permits.
  • Map, like the thumbnail, would be a simple URL. This allows for embedding google map/ESRI URL etc or other static graphic image.
  • Choosing another community will still be the drop down based on the defined dimension values. One option is to implement a popup link that launches the IBIS map with the geo area selected - much like the query selection map works.

Social Media and Pushing News etc.

  • Create and use social media (like a twitter account) to push info to interested ibis users. This content can also be included on the welcome page's aside area etc. See England's PHOF site.
  • Every adopter should consider creating and maintaining social media accounts.
  • Note that adopters can have numerous specialized system type user accounts for saved queries that can be used for sharing links etc.

Selections

  • Cat index will be replaced with the standard SELECTION LIST XML type file(s).

Query Modules

  • QM datasets will use the same new IP record structure dimension and measure definitions so we can do data discovery and comparisons.
  • QM and IP datasets will differ not in RECORD structure but in the way the groupby mechanism is specified and used.
  • QMs will need to be updated to share the same data sources and dimensions.
  • QMs will need to be updated to either include a PERIOD_DIMENSION or the view app will need to auto add an "all years" PERIOD_DIMENSION to the QM result.
  • QMs will need to have their CONFIGURATION/MEASURES related XML code cleaned up.
  • Backend QM configs will be changed to the simpler, more flexible XML control.
  • QM configuration structure and XSLT code can be modified to have an optional INDICATOR_NAME and TOPIC_NAME associated so that a user can link to an IP(s) and to topic(s).
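
The option of having the view app auto add an "all years" PERIOD_DIMENSION to a QM result might look something like the following. The XML layout used here (RECORD elements containing a DIMENSIONS element with DIMENSION children carrying `name`, `value`, and `period_flag` attributes) is an assumption made for illustration. The notes only name the RECORDS, MEASURE, DIMENSIONS, ANCILLARY_VALUES, and VALUE_ATTRIBUTE parts, not their exact structure.

```python
import xml.etree.ElementTree as ET


def ensure_period_dimension(result_xml: str,
                            period_name: str = "PERIOD",
                            all_years_value: str = "ALL_YEARS") -> str:
    """Add an 'all years' period dimension to every record that lacks one."""
    root = ET.fromstring(result_xml)
    for record in root.iter("RECORD"):
        dims = record.find("DIMENSIONS")
        if dims is None:
            dims = ET.SubElement(record, "DIMENSIONS")
        has_period = any(
            d.get("period_flag") == "true" for d in dims.findall("DIMENSION")
        )
        if not has_period:
            ET.SubElement(
                dims,
                "DIMENSION",
                {"name": period_name, "value": all_years_value, "period_flag": "true"},
            )
    return ET.tostring(root, encoding="unicode")
```

Records that already carry a flagged period dimension are left untouched, so the routine is safe to run on both updated and legacy QM results.
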
Last modified on 12/14/17 23:48:56