ITA Solution

1. What Are the Core Issues To Be Addressed?

1.1. Phase 1

1.1.1. Currently, there is no comprehensive, deep search mechanism that spans all the data sets as if they were one and intelligently returns the results for further processing. In addition, unstructured data, which often contains the most critical information, is for the most part not even included in the digital case management file under the current platform. This phase will demonstrate, as a proof of concept, a unique solution to these issues, as outlined below:

1.1.2. Search

1.1.2.1. Most case management systems have very narrow search capabilities, since they typically restrict themselves to the data within their own silo and do not offer many of the advanced features and the sophisticated relevance ranking that we have grown accustomed to from web search engines.

1.1.3. Cross-Silo Integration

1.1.3.1. By not linking records from different data sources that reference the same person, we lose a wealth of relevant information that is extremely important for decisions that should be made holistically, drawing on a variety of sources. All related records will be part of a unified view that can easily display the different sources of data together as one virtual record or case. There will also be a section with links to contextually relevant, external knowledge sources to be used as reference material.

1.1.4. Unstructured Data

1.1.4.1. The black sheep of most systems, unstructured documents, multimedia, and free-form data are generally swept under the rug because of the difficulty of searching them intelligently and then integrating them into a structured data environment. Nevertheless, studies have shown them to be more valuable than structured data, so they must be addressed and accommodated.

1.1.5. Implementation and Deployment

1.1.5.1. Installation and Preparation

1.1.5.1.1. Given the sensitivity of the data, for a proof of concept it would be sufficient to obtain de-identified data and to use the demo GSA from SOC. Additionally, if it is determined that the data requires cleansing, then an appropriate tool, such as Google Refine 2.0 or Data Ladder, will be used to create a single new table representing the mapping between the old data and the new, transformed, cleaned-up data. This transformation table is used so as not to alter the original data, yet it can be incorporated into the solution transparently by having GSA index it as part of its views, as described below. The data cleansing can happen at any time and is not a prerequisite for the application to run.
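As an illustration, the following Python sketch shows one way such a transformation table might be built with SQLite. The table and column names (clients, client_id, raw_name) and the cleansing rule are hypothetical placeholders, not part of the actual data model.

```python
# Minimal sketch: build a transformation table that maps original records to
# cleansed values without altering the source data. Assumes a hypothetical
# "clients" table already exists in the database.
import sqlite3

def cleanse(value: str) -> str:
    """Placeholder cleansing rule: trim whitespace, collapse spaces, normalize case."""
    return " ".join(value.strip().upper().split())

conn = sqlite3.connect("ita_poc.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS client_transform (
        client_id   INTEGER PRIMARY KEY,   -- key of the original record
        raw_name    TEXT,                  -- value as it appears in the source
        clean_name  TEXT                   -- cleansed value used for matching
    )
""")

rows = conn.execute("SELECT client_id, raw_name FROM clients").fetchall()
for client_id, raw_name in rows:
    conn.execute(
        "INSERT OR REPLACE INTO client_transform VALUES (?, ?, ?)",
        (client_id, raw_name, cleanse(raw_name)),
    )
conn.commit()
```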

1.1.5.2. Google Search Appliance (GSA)

1.1.5.2.1. Since GSA can index not only data tables but also "views" of merged tables, we can combine as many tables as possible to create "super-records" that represent as much of the combined, related information as possible. There may be many different super-records in any given data set; we will call such a set of interrelated super-records a "cluster." Furthermore, if we can join or merge records and super-records across different clusters, then that too can be a view and therefore also indexed as one big record spanning multiple data sets, which we will refer to as a "uni-record." The resulting index of all three types of records is what will be searched every time a user submits a query, and the GSA will return all relevant records in sorted order as an XML structure, which is then processed by the ITA portion of the solution. Note that if the transformation table described above is created during data cleansing, then the transformed data can also be incorporated, either instead of or in addition to the original data.
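The sketch below illustrates the view idea with SQLite. All table and column names (program_a_intake, program_a_services, program_b_cases) are hypothetical; in practice the views would be defined in the source databases and crawled by GSA like any other content.

```python
# Minimal sketch of the "view" approach: a super-record view merges related
# tables within one data set (a cluster), and a uni-record view joins records
# across data sets on a shared identifier.
import sqlite3

conn = sqlite3.connect("ita_poc.db")
# Hypothetical base tables from two separate program data sets.
conn.execute("CREATE TABLE IF NOT EXISTS program_a_intake (client_id, name, dob)")
conn.execute("CREATE TABLE IF NOT EXISTS program_a_services (client_id, service_date, service_type)")
conn.execute("CREATE TABLE IF NOT EXISTS program_b_cases (client_id, case_status)")

# Super-record: merge intake and services records within program A's cluster.
conn.execute("""
    CREATE VIEW IF NOT EXISTS program_a_super AS
    SELECT i.client_id, i.name, i.dob, s.service_date, s.service_type
    FROM program_a_intake AS i
    JOIN program_a_services AS s ON s.client_id = i.client_id
""")

# Uni-record: one wide record spanning program A's super-records and program B's cases.
conn.execute("""
    CREATE VIEW IF NOT EXISTS uni_record AS
    SELECT a.*, b.case_status
    FROM program_a_super AS a
    JOIN program_b_cases AS b ON b.client_id = a.client_id
""")
conn.commit()
```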

1.1.5.3. ITA Dashboard

1.1.5.3.1. Through this graphical interface, the user will have several options for how to search: 1) free-form keywords in a traditional Google search box; 2) selecting exactly which fields from which data set to search against; or 3) using an existing record, with its field values filled in, as a template to match against. The result set returned from GSA will then be processed and filtered before being displayed to the user as the candidate matches for the search query. The user can then select the appropriate case record and submit a request to retrieve all linked records from any or all of the remaining data sets in order to have a complete, holistic view of that case, at least from the perspective of the available data sets. The success of linking records that refer to the same case across different data sets relies heavily on the system's ability to do "identity matching," as described below.
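A minimal sketch of the ITA layer querying the appliance and filtering its XML result set is shown below. The host name is a placeholder, and the element names (R, U, T, S) follow the GSA XML results format as we understand it; they should be verified against the actual appliance output.

```python
# Minimal sketch: submit a query to a (hypothetical) GSA host, parse the XML
# result set, and apply an ITA-side filtering hook before display.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

GSA_HOST = "http://gsa.example.org"   # hypothetical appliance address

def search(query: str, keep=lambda record: True):
    url = f"{GSA_HOST}/search?q={urllib.parse.quote(query)}&output=xml_no_dtd"
    with urllib.request.urlopen(url) as resp:
        root = ET.fromstring(resp.read())
    results = []
    for r in root.iter("R"):                         # one R element per result
        record = {
            "url": r.findtext("U", default=""),      # link back to the source record
            "title": r.findtext("T", default=""),
            "snippet": r.findtext("S", default=""),
        }
        if keep(record):                             # ITA-side filter before display
            results.append(record)
    return results
```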

1.1.5.4. Identity Matching

1.1.5.4.1. Data Quality

1.1.5.4.2. Fuzzy Field Comparison

1.1.5.4.3. Fuzzy Record Comparison
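A minimal sketch of the fuzzy field and record comparison, using only Python's standard library, is shown below. The field weights and the 0.85 match threshold are illustrative assumptions, not tuned values, and the data-quality step is reduced here to simple trimming and case folding.

```python
# Minimal sketch of fuzzy field and record comparison for identity matching.
from difflib import SequenceMatcher

def field_similarity(a: str, b: str) -> float:
    """Fuzzy field comparison: 1.0 for an exact match, lower for near misses."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()

def record_similarity(rec1: dict, rec2: dict, weights: dict) -> float:
    """Fuzzy record comparison: weighted average of per-field similarities."""
    score = sum(
        w * field_similarity(str(rec1.get(f, "")), str(rec2.get(f, "")))
        for f, w in weights.items()
    )
    return score / sum(weights.values())

weights = {"name": 0.5, "dob": 0.3, "address": 0.2}   # illustrative weights
a = {"name": "Jon Smith",  "dob": "1980-01-02", "address": "12 Main St"}
b = {"name": "John Smith", "dob": "1980-01-02", "address": "12 Main Street"}
if record_similarity(a, b, weights) >= 0.85:          # candidate match threshold
    print("Likely the same person; link the records for review.")
```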

1.2. Phase 2

1.2.1. Alerts and Notifications

1.2.1.1. While alerts can be valuable within a single data set, their true power emerges when they can monitor and report on any combination of fields coming from any of the diverse data sets. This will allow us to monitor and notify the correct people of situations that can only be described using a diverse range of data sources, such as school records, Justice Department records, health records, social programs, building department records, etc.
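As one illustration, the rule below fires when fields drawn from two different data sets co-occur in a single uni-record. The field names, thresholds, and notification stub are hypothetical placeholders.

```python
# Minimal sketch of a cross-data-set alert rule evaluated against a uni-record.
def school_and_housing_alert(uni_record: dict) -> bool:
    """Fires when heavy school absences and an open housing violation co-occur."""
    return (
        uni_record.get("school_absences_30d", 0) > 10
        and uni_record.get("open_housing_violations", 0) > 0
    )

def notify(recipient: str, message: str) -> None:
    print(f"ALERT to {recipient}: {message}")   # stand-in for email/SMS delivery

record = {
    "case_id": "A-1042",
    "case_worker_email": "worker@example.org",
    "school_absences_30d": 14,
    "open_housing_violations": 2,
}
if school_and_housing_alert(record):
    notify(record["case_worker_email"], f"Review case {record['case_id']}")
```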

1.2.2. Reporting and Data Visualization

1.2.2.1. The user should be able to take any results returned from a search and view them as a chart, graph, or map. As a more advanced feature, the user should also be able to overlay any other data set or points of interest on these visualizations to provide a comparison or additional relevant information. Reports can also be generated to show aggregate and summary information, which can be exported in various formats.
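A small sketch of turning a result set into a chart plus a CSV export is shown below; it assumes matplotlib is available, and the result fields and values are illustrative only.

```python
# Minimal sketch: chart a summarized search result set and export it as CSV.
import csv
import matplotlib.pyplot as plt

results = [                                   # illustrative aggregate results
    {"program": "Housing", "cases": 42},
    {"program": "Health", "cases": 31},
    {"program": "Education", "cases": 18},
]

plt.bar([r["program"] for r in results], [r["cases"] for r in results])
plt.ylabel("Open cases")
plt.title("Cases by program (search result summary)")
plt.savefig("cases_by_program.png")            # chart export

with open("cases_by_program.csv", "w", newline="") as f:   # alternate export format
    writer = csv.DictWriter(f, fieldnames=["program", "cases"])
    writer.writeheader()
    writer.writerows(results)
```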

1.2.3. Predictive Analytics and Machine Learning

1.2.3.1. Using all available historical data, we can leverage Google’s Prediction API or other competing statistical packages supporting regression analysis in order to better predict outcomes and to incorporate machine learning algorithms that could suggest conditions and factors which should be incorporated into the alerts and notification system as well as the reporting module.
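As a sketch of the idea, the example below trains a simple logistic regression with scikit-learn, standing in for any of the statistical packages mentioned above. The features, labels, and data are invented for illustration; real work would draw on historical case data.

```python
# Minimal sketch: fit a regression-style model on historical case features and
# inspect which factors drive the prediction (candidates for alert rules).
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Each row: [school_absences_30d, prior_contacts, open_violations]; label 1 = needed follow-up.
X = [[14, 3, 2], [1, 0, 0], [9, 1, 1], [0, 2, 0], [20, 5, 3], [2, 0, 0]]
y = [1, 0, 1, 0, 1, 0]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=0)
model = LogisticRegression().fit(X_train, y_train)

print("Held-out accuracy:", model.score(X_test, y_test))
# Coefficient magnitudes hint at which factors matter, feeding back into alerts and reports.
print("Feature weights:", model.coef_)
```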

1.3. Phase 3

1.3.1. Complete Workflow Automation and Tracking

1.3.1.1. Convert all the workflow logic and processes into a guided, wizard-like process with full information tracking, context-driven help, and a checks-and-balances system that alerts on and reports any contradictions, conflicts, or misinformation via a rule-based engine. The engine determines who needs to be contacted, when, over which communication media, and with what follow-up actions. Reports can be generated dynamically to show either specific information about individual cases or aggregate summaries that reflect the successes and failures of the current workflow.
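A minimal sketch of such a rule-based routing engine is shown below; the rule conditions, roles, and media are hypothetical placeholders rather than the actual workflow rules.

```python
# Minimal sketch of rule-based routing: each rule inspects a case and yields
# who to contact, over which medium, and the follow-up action to track.
from dataclasses import dataclass

@dataclass
class Action:
    contact: str       # role or person to notify
    medium: str        # "email", "sms", "phone", ...
    follow_up: str     # next step to track in the workflow

RULES = [
    (lambda c: c.get("missing_signature"),     Action("case_worker", "email", "collect signature within 5 days")),
    (lambda c: c.get("conflicting_addresses"), Action("supervisor", "phone", "resolve address conflict")),
]

def route(case: dict):
    """Return the actions triggered by the current state of a case."""
    return [action for condition, action in RULES if condition(case)]

for a in route({"missing_signature": True, "conflicting_addresses": True}):
    print(f"Contact {a.contact} via {a.medium}; follow-up: {a.follow_up}")
```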

1.3.2. Collaboration, Interaction, and Multimedia Repository

1.3.2.1. As an extension of the workflow automation and tracking, collaboration on any document of any type should be easily and readily available to any predefined groups of users based on their role and permission settings. Documents ultimately produced will then become part of the repository as well as being integrated into the workflow. All documents and files, including images and video, will be parsed and indexed so that they are available for specific, detailed searches even within the videos themselves.

1.4. Phase 4

1.4.1. Communications Platform

1.4.1.1. With smartphones rapidly becoming the communication medium of choice, especially among low-income immigrants as well as close to 60% of the homeless population, this is a perfect opportunity to take all forms of communication to a new level and offer an unprecedented level of service, information, and interaction between the organization and its clients. At its core, the platform can accommodate text messages, voice calls, and email, and offer a wide range of services and information on demand without unnecessarily tying up a worker’s time: direct access to FAQs, a 311-type service, the ability to register for various alerts, updates, and announcements, access to documents and forms, emergency services, and so on, all of which would be available in the client’s native language. All communications would be logged with the case file and can therefore be reviewed, searched, and analyzed like all other data.

1.4.2. Migration to Next-Generation Technology

1.4.2.1. All of the previous phases can be deployed without altering any of the existing legacy systems or platforms. However, at some point it is advisable to migrate from the old paradigm to the newer technologies that are quickly becoming the standard. This will allow any future enhancements or development to be deployed much more quickly and with greater flexibility and growth potential.

2. Overview

2.1. As an organization grows and continually adds new programs and projects, the amount and the complexity of its data grow even more quickly, often reaching a point where it is not only difficult to search for information concerning a particular client within a given database, but almost impossible to find all the information for that client across all of the different programs’ databases. Answers to seemingly simple questions like “how many different programs has this client touched?” become unanswerable. This, then, is the primary focus of the following proof of concept. This document describes an innovative approach to handling information sharing and case management requirements broadly by incorporating Google’s search technology, fuzzy comparisons, and predictive analytics in order to unify disparate structured and unstructured data sources within a single view. We divide our roadmap vision into four independent phases, and then provide a more detailed description of the Phase 1 implementation and deployment, which will be our proof of concept. In a nutshell, the Phase 1 solution and deliverables are a hybrid system composed of the Google Search Appliance as the back-end search layer and our Insights to Action (ITA) Dashboard, which provides a universal view of all the relevant data, handles identity matching, and offers contextual links to external knowledge sources. This solution can then serve as a showcase for both the technology and the validation of the approach, which will lead to the next phases of development.