Navy Automated Data Cleansing with Machine Learning

Customer Challenge

Poor data quality is hindering the Department of the Navy’s (DON) ability to gain valuable and accurate insight from its data. Given the volume of errors, manual correction is ineffective and inefficient.

Innovative Solution

ILW data scientists implemented Phase I of our Automated Data Cleansing and Analysis Tool (ADCAT), which applies machine learning (ML) and probabilistic graphical modeling (PGM) to automatically cleanse DON data of errors.

For Phase II, ILW is enhancing and optimizing the algorithms, adding model quality monitoring, and building a user interface to improve healing functionality across domains, while also preparing for deployment in the DON environment.

Benefits/Outcomes

  • Robust natural language processing (NLP) and ML classifier models achieve 96% to 99.8% accuracy
  • ADCAT’s PGMs provide end users with the five most probable corrections for a given error; in 98% of cases, the correct value was among those five
  • Opens up the black box of ML error-correction logic by providing transparent, human-understandable explanations
  • Scalable processes and automatic discovery methods enable new error correction models to be built quickly
  • Human-in-the-loop solution is available to enable review and validation of the ML-driven error corrections
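
As a rough illustration of the "five most probable corrections" idea above, the sketch below ranks candidate values for a suspect field with a simple naive Bayes model, one of the simplest Bayesian graphical models. It is not ADCAT's actual model: the field names (`component`, `action`, `work_center`), the sample records, and the function names are all hypothetical, and a real PGM would likely capture richer dependencies between fields.

```python
import math
from collections import Counter, defaultdict


def fit_counts(records, target, evidence):
    """Count target values and their co-occurrence with each evidence field."""
    prior = Counter()                # target value -> count
    cond = defaultdict(Counter)      # (field, target value) -> evidence value counts
    vocab = defaultdict(set)         # field -> distinct evidence values seen
    for rec in records:
        t = rec[target]
        prior[t] += 1
        for field in evidence:
            cond[(field, t)][rec[field]] += 1
            vocab[field].add(rec[field])
    return prior, cond, vocab


def top_k_corrections(record, prior, cond, vocab, evidence, k=5, alpha=1.0):
    """Return the k most probable target values given the other fields.

    Uses naive Bayes with Laplace smoothing (alpha) and normalizes the
    scores so they sum to 1 over all known target values.
    """
    total = sum(prior.values())
    log_scores = {}
    for t, n_t in prior.items():
        lp = math.log(n_t / total)
        for field in evidence:
            count = cond[(field, t)][record[field]]
            lp += math.log((count + alpha) / (n_t + alpha * len(vocab[field])))
        log_scores[t] = lp
    # Normalize in log space for numerical stability.
    peak = max(log_scores.values())
    z = sum(math.exp(v - peak) for v in log_scores.values())
    probs = {t: math.exp(v - peak) / z for t, v in log_scores.items()}
    return sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Hypothetical clean records used to learn the counts.
records = [
    {"component": "pump", "action": "replace", "work_center": "hydraulics"},
    {"component": "pump", "action": "inspect", "work_center": "hydraulics"},
    {"component": "valve", "action": "replace", "work_center": "hydraulics"},
    {"component": "radio", "action": "replace", "work_center": "avionics"},
    {"component": "radar", "action": "inspect", "work_center": "avionics"},
    {"component": "radio", "action": "inspect", "work_center": "avionics"},
]
evidence = ["component", "action"]
prior, cond, vocab = fit_counts(records, "work_center", evidence)

# A record whose work_center value is flagged as an error (misspelled).
suspect = {"component": "pump", "action": "replace", "work_center": "avioncs"}
ranked = top_k_corrections(suspect, prior, cond, vocab, evidence, k=5)
for value, prob in ranked:
    print(f"{value}: {prob:.3f}")
```

In a human-in-the-loop workflow, a ranked list like `ranked` could be shown to a reviewer together with the evidence fields that drove each score, so the suggested correction can be accepted or overridden.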

Business Value

  • Improved analyst productivity: less time correcting data, increased focus on core mission tasks
  • Higher-quality data: higher-confidence, data-informed decisions and cost savings

Toolbox

  • Supervised/unsupervised ML
  • Probabilistic graphical model (Bayesian)
  • Open-source Python solution using DoD-compatible libraries
  • Categorical, ordinal, and string data types
  • NAVAIR maintenance data
  • NAVSEA labor data
