Predictive maintenance
Datasets for Predictive Maintenance
The project is written primarily in Jupyter Notebook, is distributed under the MIT License, and was first published in 2021. Key topics include: ai-engineering, anomaly, anomaly-detection, automation, condition-based-maintenance.
This repository is intended to enable quick access to datasets for predictive maintenance (PM) tasks (under development).
The following table summarizes the available features.
A * after a dataset name marks a dataset with rich attributes, which you may want to check first.
Note that RUL means remaining useful life.
| Dataset | Timestamp | #Sensor | #Alarm | RUL | License |
|---|---|---|---|---|---|
| ALPI* | x | | 140 | | CC-BY |
| CBM | x | 16 | 3 | | Other |
| CMAPSS | x | 26 | 2-6 | x | CC0: Public Domain |
| GDD | x | 5(1) | 3 | | CC-BY-NC-SA |
| GFD | x | 4 | 2 | | CC-BY-SA |
| HydSys* | x | 17 | 2-4 | | Other |
| MAPM* | x | 4 | 5 | x | Other |
| PPD | x | 25 | | x | CC-BY-SA |
| UFD | | 37-52 | 4 | | Other |
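For datasets marked in the RUL column, one common convention is to label each observation of a run that ends in failure with the number of cycles left until the final observation. The sketch below illustrates that convention only; the function `rul_labels` and the sample cycle numbers are hypothetical and are not part of this repository.

```python
from typing import List, Sequence


def rul_labels(cycles: Sequence[int]) -> List[int]:
    """Return the remaining useful life for each observed cycle.

    Assumes `cycles` is one unit's run-to-failure history in time order,
    so the last cycle is the failure point and gets RUL 0.
    """
    if not cycles:
        return []
    failure_cycle = cycles[-1]
    return [failure_cycle - c for c in cycles]


# Hypothetical unit observed at cycles 1..5, failing at cycle 5.
labels = rul_labels([1, 2, 3, 4, 5])
print(labels)  # [4, 3, 2, 1, 0]
```

Runs that stop before failure (censored runs) do not have a known failure cycle, so this labeling would not apply to them as written.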
Installation
- Python=3.7
- pandas=1.1.2
Usage
Place the `datasets` directory in your workspace and import it as follows:
```python
import datasets

# Dataset-specific values will be returned
datasets.ufd.load_data()

# A visualization pdf will be generated
datasets.ufd.gen_summary()
```
Each dataset class has the following functions:
- load_data(index):
  Loads the dataset specified by 'index'.
  Please see README.md in each dataset directory for more details.
- gen_summary(outdir):
  Generates a PDF file that visualizes the full dataset.
Features
Run-to-Failure
Run-to-Failure data require:
- time column
- event/censoring column (categorical)
- numerical/categorical feature columns (optional)
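The column requirements above can be pictured with plain Python records. This is a minimal sketch under stated assumptions: the `Run` class, the `"failure"` and `"censored"` labels, and the feature names are all hypothetical and do not reflect the schema of any dataset in this repository.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union

# Hypothetical labels for the event/censoring column.
FAILURE = "failure"
CENSORED = "censored"


@dataclass
class Run:
    time: float  # time column: duration until failure or end of observation
    event: str   # event/censoring column (categorical)
    # Optional numerical/categorical feature columns.
    features: Dict[str, Union[float, str]] = field(default_factory=dict)


def validate(runs: List[Run]) -> None:
    """Raise ValueError if a row breaks the assumed layout."""
    for i, run in enumerate(runs):
        if run.time < 0:
            raise ValueError(f"row {i}: time must be non-negative")
        if run.event not in (FAILURE, CENSORED):
            raise ValueError(f"row {i}: unknown event label {run.event!r}")


def failure_times(runs: List[Run]) -> List[float]:
    """Times of runs that ended in an observed failure (censored runs excluded)."""
    return [run.time for run in runs if run.event == FAILURE]


runs = [
    Run(time=120.0, event=FAILURE, features={"load": 0.8, "model": "A"}),
    Run(time=200.0, event=CENSORED),  # still running when observation stopped
    Run(time=95.5, event=FAILURE, features={"load": 1.1}),
]
validate(runs)
print(failure_times(runs))  # [120.0, 95.5]
```

The same layout maps directly onto a table with one row per run, where the censored rows keep their observation time but carry no failure event.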
Notebooks
Jupyter notebooks are provided for all datasets
to help with interactive data processing and visualization.
References
Introduction to Predictive Maintenance
- Wikipedia:
  https://en.wikipedia.org/wiki/Predictive_maintenance
- Azure AI guide for predictive maintenance solutions:
  https://docs.microsoft.com/en-us/azure/architecture/data-science-process/predictive-maintenance-playbook
- Open source Python package for Survival Analysis modeling:
  https://square.github.io/pysurvival/index.html
- Types of proactive maintenance:
  https://solutions.borderstates.com/types-of-proactive-maintenance/
- Common license types for datasets:
  https://www.kaggle.com/general/116302
Dataset Sources
- ALPI: Diego Tosato, Davide Dalle Pezze, Chiara Masiero, Gian Antonio Susto, Alessandro Beghi, 2020. Alarm Logs in Packaging Industry (ALPI).
  https://ieee-dataport.org/open-access/alarm-logs-packaging-industry-alpi
- CBM: Condition Based Maintenance of Naval Propulsion Plants Data Set:
  http://archive.ics.uci.edu/ml/datasets/condition+based+maintenance+of+naval+propulsion+plants
- CMAPSS: NASA Turbofan Jet Engine Data Set:
  https://www.kaggle.com/behrad3d/nasa-cmaps
- GDD: Genesis demonstrator data for machine learning:
  https://www.kaggle.com/inIT-OWL/genesis-demonstrator-data-for-machine-learning
- GFD: Gearbox Fault Diagnosis:
  https://www.kaggle.com/brjapon/gearbox-fault-diagnosis
- HydSys: Predictive Maintenance Of Hydraulics System:
  https://archive.ics.uci.edu/ml/datasets/Condition+monitoring+of+hydraulic+systems
- MAPM: Microsoft Azure Predictive Maintenance:
  https://www.kaggle.com/arnabbiswas1/microsoft-azure-predictive-maintenance
- PPD: Production Plant Data for Condition Monitoring:
  https://www.kaggle.com/inIT-OWL/production-plant-data-for-condition-monitoring
- UFD: Ultrasonic flowmeter diagnostics Data Set:
  https://archive.ics.uci.edu/ml/datasets/Ultrasonic+flowmeter+diagnostics
TODO
- Birkl, Christoph. Oxford Battery Degradation Dataset 1. University of Oxford, 2017.
  https://ora.ox.ac.uk/objects/uuid:03ba4b01-cfed-46d3-9b1a-7d4a7bdf6fac
- Lu, Jiahuan; Xiong, Rui; Tian, Jinpeng; Wang, Chenxu; Hsu, Chia-Wei; Tsou, Nien-Ti; Sun, Fengchun; Li, Ju (2021), “Battery Degradation Dataset (Fixed Current Profiles&Arbitrary Uses Profiles)”, Mendeley Data, V2.
  https://data.mendeley.com/datasets/kw34hhw7xg/2
- One Year Industrial Component Degradation:
  https://www.kaggle.com/inIT-OWL/one-year-industrial-component-degradation
- Vega shrink-wrapper component degradation:
  https://www.kaggle.com/inIT-OWL/vega-shrinkwrapper-runtofailure-data
- NASA Bearing Dataset:
  https://www.kaggle.com/vinayak123tyagi/bearing-dataset
- CWRU Bearing Dataset:
  https://www.kaggle.com/brjapon/cwru-bearing-datasets
License
All materials except the datasets are available under the MIT license.
I preserve all raw data but attach data loading and preprocessing tools
to each dataset directory so that they can be used quickly in Python.
Each dataset should be used under its own license.
