Text Analytics of PDF Technical Documents

Customer Challenge

The Air Force required a logistics data crosswalk to bridge known gaps between maintenance and supply data that limited accurate demand planning and forecasting.

Innovative Solution

ILW data scientists used natural language processing (NLP) and unsupervised machine learning (ML) to evaluate and select an automated method for linking Work Unit Codes (WUCs) to related National Item Identification Numbers (NIINs). They drew on information extracted from Technical Orders in native PDF format, along with data captured in maintenance and supply data systems.
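
To illustrate what an unsupervised text-matching crosswalk can look like, here is a minimal sketch that scores WUC descriptions against NIIN item names with TF-IDF weights and cosine similarity. It is not the ILW team's method, which the case study does not describe in detail. The codes, descriptions, function names, and the 0.2 threshold are all hypothetical placeholders, and the sketch uses only the Python standard library rather than the project's toolbox.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document using smoothed TF-IDF."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def crosswalk(wuc_descriptions, niin_descriptions, threshold=0.2):
    """Map each WUC to its best-scoring NIIN, or None if below the threshold."""
    wucs = list(wuc_descriptions)
    niins = list(niin_descriptions)
    texts = [wuc_descriptions[w] for w in wucs] + [niin_descriptions[n] for n in niins]
    vecs = tfidf_vectors(texts)
    wuc_vecs, niin_vecs = vecs[: len(wucs)], vecs[len(wucs):]
    result = {}
    for w, wv in zip(wucs, wuc_vecs):
        scored = [(cosine(wv, nv), n) for n, nv in zip(niins, niin_vecs)]
        best_score, best_niin = max(scored, default=(0.0, None))
        result[w] = best_niin if best_score >= threshold else None
    return result


# Hypothetical placeholder data, not real WUCs or NIINs.
wuc_text = {
    "WUC-A": "main landing gear wheel assembly",
    "WUC-B": "fuel boost pump",
    "WUC-C": "radar antenna",
}
niin_text = {
    "NIIN-1": "wheel, landing gear",
    "NIIN-2": "pump, fuel, boost",
    "NIIN-3": "valve, hydraulic",
}
matches = crosswalk(wuc_text, niin_text)
```

In this toy data, WUC-C shares no terms with any NIIN description, so it stays unmatched rather than being forced onto a poor candidate. A real crosswalk would likely need domain-specific tokenization and human review of low-scoring links.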

Benefits/Outcomes

  • Extracted master parts lists (MPLs) for two Air Force weapon system programs
  • Developed multiple table extraction techniques that read PDF documents and pull out tabular information with a high degree of accuracy; the techniques leverage and improve open-source libraries
  • Provided enterprise search capability for Air Force technical documents
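
One common way to recover a table from a native PDF is to group positioned words into rows by vertical position and then assign them to columns by horizontal position. The sketch below shows that idea on hand-written coordinates. It is an illustration under stated assumptions, not the techniques developed in this project. The `(x, y, text)` word tuples, the `column_starts` values, and both function names are hypothetical, and the input resembles the word-level output that PDF libraries can provide rather than any specific library's format.

```python
import bisect


def group_rows(words, y_tolerance=3.0):
    """Cluster (x, y, text) words into rows whose y values lie close together."""
    rows = []
    for x, y, text in sorted(words, key=lambda w: (w[1], w[0])):
        # Start a new row when the word sits too far below the current row.
        if rows and abs(y - rows[-1]["y"]) <= y_tolerance:
            rows[-1]["cells"].append((x, text))
        else:
            rows.append({"y": y, "cells": [(x, text)]})
    # Within each row, order the words left to right.
    return [sorted(r["cells"]) for r in rows]


def to_table(words, column_starts, y_tolerance=3.0):
    """Build a list of rows, placing each word in the column it falls under.

    column_starts must be sorted left edges of the columns (assumed known here).
    """
    table = []
    for cells in group_rows(words, y_tolerance):
        row = [""] * len(column_starts)
        for x, text in cells:
            col = max(bisect.bisect_right(column_starts, x) - 1, 0)
            row[col] = (row[col] + " " + text).strip()
        table.append(row)
    return table


# Hypothetical word positions for a three-column parts table.
words = [
    (50, 100.0, "FIG"), (120, 100.5, "PART"), (160, 100.0, "NUMBER"), (300, 99.8, "QTY"),
    (50, 120.0, "1"), (120, 120.4, "AB-100"), (300, 120.0, "2"),
    (50, 140.0, "2"), (120, 140.0, "CD-200"), (300, 140.2, "4"),
]
table = to_table(words, column_starts=[40, 110, 290])
```

The tolerance absorbs small baseline differences between words on the same line. Real Technical Order tables would also need handling for wrapped cells, ruled lines, and tables that span pages, which this sketch does not attempt.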

Business Value

  • Improves parts supportability, contract lead times, and integrated repair planning
  • Enables planning for predictable shifts in demands and condemnations, so that the right parts are bought in the right quantities and overbuys of other parts are avoided

Toolbox

  • Open-source Python solution using DoD-compatible libraries: Pandas, Tabula, Fitz (PyMuPDF), Scikit-learn, and OpenCV
  • Native PDFs
  • Text analytics, NLP, and machine learning

Related Case Studies You May Like

Webinar: Blackhawk Helicopter Preflight Inspection Augmented Reality Application

Webinar: Paint Hangar Environmental Monitoring

Webinar: Making the Most of Your Sensors — Smart Facilities, Factories, and Labs

Data Science & Architecture Assessment

Micro-Model Machine Shop Simulator – Smart Factory

Automating Data Rights Understanding

Text Analytics of PDF Technical Documents

Expert Capture Webinars

Data Engineering & Data Science

Paint Hangar IoT Monitoring

Pre-Flight Inspection AR Application

Expert Capture Maintenance Training Pilot

Navy Automated Data Cleansing with ML

Automated Data Capture and Prediction

Automated Data Crosswalks

Contract Conversion & Analytics

Database Tuning & Optimization

Augmented Reality Tools to Increase Workforce Productivity Across the Enterprise

Decision Support for Cyber Hygiene

Augmented Reality Engineering Collaboration

Big Data Ingestion & Cloud Architecture

Cloud-Native Azure PaaS Architecture

Azure Data Integration Hub Modernization

Data Warehousing & Business Intelligence

App Service Azure Infrastructure

Big Data Engineering for Improved Analytics

Agile Big Data Development

Cloud-Based Big Data Analytics

Supply Chain Predictive Analytics

Cost Allocation Rules Engine Modernization

Data Services Cloud Migration Support

Predictive Analytics for the Aircraft Digital Thread

On-Demand Maintenance Analytics

Algorithm Development & Text Analytics

Machine Learning & NLP for Decision Support

Sensor Data Analysis for Predictive CBM+

Cutting-Edge Responsive Design

Data Cleansing and Migration

Application Modernization

Large-Scale Data Integration

Data Science Big Data Ingestion

Modern Analytics Framework

Agile Big Data Analytics Framework

Big Data Hadoop Administration

Modern Data Ingestion Framework

Performance Tuning & Best Practices

Engines Forecast Reporting Tool

Augmented Reality Combustion Chamber & Gear Pump Disassembly

Optimization Using Hadoop

Data Quality & Lineage Mapping

Big Data Platform Analytics Outcomes

Modern Analytic Framework

Enterprise Data Warehousing

Value-Driven Analytics

Enterprise Data Exchange

Valuable Insight into Customer Shopping Behaviors
