We are happy to announce the release of AI-based detection of duplicate defect reports in Railfleet.
On average, 10% of reported corrective defects are duplicates, which can compromise maintenance data quality and create unnecessary work. The AI-based feature assists users in detecting duplicates to improve data quality and ensure accurate maintenance processes.
The benefits are safer and more efficient maintenance collaboration between ECM3 and ECM4 teams, and easier return on experience (REX) analysis for ECM2 maintenance engineers.
Why focus on duplicate detection?
Data Quality: On average, 10% of reported corrective defects are duplicates, which degrades the quality of the data shared between ECM3 and ECM4 teams.
Maintenance Efficiency: Eliminating duplicates reduces extra work and potential safety issues.
A good use case for AI: Large Language Models (LLMs) are good at classifying and comparing information accurately in multiple European languages.
It can be safely built: The AI only flags potential duplicates and leaves the final decision to users, so it does not interfere with maintenance event creation and management.
It meets data protection standards: Customer data is only used for fine-tuning and inference under an enterprise licence, which prevents the data from being disclosed to LLM providers for training purposes. The duplicate detection feature is GDPR compliant.
How does duplicate detection work?
When a new defect is added, or the description of an existing defect is updated, the system checks for similar open defects. If a match is found, the new or updated defect is marked as a potential duplicate and linked to the similar defect. The user can then review the match and confirm whether or not the defects are duplicates.
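The check described above can be sketched as follows. This is a minimal illustration, not the Railfleet implementation: the Defect class, the check_for_duplicates function and the 0.8 threshold are hypothetical names and values, and a plain string-similarity score from Python's difflib stands in for the LLM-based comparison.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class Defect:
    # Hypothetical data model, for illustration only.
    defect_id: int
    description: str
    is_open: bool = True
    # IDs of similar open defects this defect is flagged against.
    potential_duplicates: list[int] = field(default_factory=list)


def similarity(a: str, b: str) -> float:
    """Stand-in scorer: plain string similarity, NOT the product's LLM."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def check_for_duplicates(
    defect: Defect, existing: list[Defect], threshold: float = 0.8
) -> Defect:
    """Run when a defect is created or its description is updated.

    Only open defects are compared. Matches are recorded as links on the
    defect; nothing is merged or deleted, and the user decides afterwards.
    """
    defect.potential_duplicates = [
        other.defect_id
        for other in existing
        if other.is_open
        and other.defect_id != defect.defect_id
        and similarity(defect.description, other.description) >= threshold
    ]
    return defect


if __name__ == "__main__":
    existing = [
        Defect(1, "Door 3 does not close properly"),
        Defect(2, "Brake pad worn on bogie 2"),
        Defect(3, "Door 3 does not close properly", is_open=False),
    ]
    new = check_for_duplicates(Defect(4, "Door 3 does not close properly."), existing)
    print(new.potential_duplicates)
```

In this sketch the closed defect is skipped even though its description matches, because only open defects are checked.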
All similar events can be viewed from any of the defects detected as duplicates (as displayed below).
The detection result can be reviewed and confirmed (or not) by any user with editing rights. Confirmed duplicates are displayed in the detail view of each defect concerned.
Confirming that two defects are duplicates will NOT delete any information on the platform. This information is an extra indication that can be used in future analysis. It can be reverted at any time by the user.
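One way to picture the review step is as a status on the link between two defects, which users with editing rights can set and later revert. The sketch below is an assumption made for illustration: LinkStatus, DuplicateLink, review and revert are hypothetical names, not Railfleet's data model or API.

```python
from dataclasses import dataclass
from enum import Enum


class LinkStatus(Enum):
    POTENTIAL = "potential"  # flagged by the detector, not yet reviewed
    CONFIRMED = "confirmed"  # a user confirmed the duplicate
    REJECTED = "rejected"    # a user decided the defects are distinct


@dataclass
class DuplicateLink:
    # Hypothetical record linking a defect to a similar one.
    defect_id: int
    similar_defect_id: int
    status: LinkStatus = LinkStatus.POTENTIAL


def review(link: DuplicateLink, is_duplicate: bool, can_edit: bool) -> DuplicateLink:
    """Record a user's decision. Only the link status changes."""
    if not can_edit:
        raise PermissionError("editing rights are required to review a duplicate")
    link.status = LinkStatus.CONFIRMED if is_duplicate else LinkStatus.REJECTED
    return link


def revert(link: DuplicateLink, can_edit: bool) -> DuplicateLink:
    """Undo an earlier decision and return the link to 'potential'."""
    if not can_edit:
        raise PermissionError("editing rights are required to revert a decision")
    link.status = LinkStatus.POTENTIAL
    return link


if __name__ == "__main__":
    link = DuplicateLink(defect_id=4, similar_defect_id=1)
    review(link, is_duplicate=True, can_edit=True)
    print(link.status.value)
    revert(link, can_edit=True)
    print(link.status.value)
```

Because the decision lives on the link rather than on either defect, confirming or reverting never touches the defect records themselves.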
How well does the AI work on our test benchmark?
Our duplicate detection tool is designed to minimize the burden on our customers by maintaining a false positive rate below 10%, while still detecting the majority of duplicate defect reports. With your feedback, we aim to further improve this feature over time.
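For readers who want to see how such figures are typically computed, here is a small sketch. It assumes the standard definitions: false positive rate is the share of genuinely distinct pairs that get flagged, and recall is the share of true duplicates that get detected. The exact metric definitions behind our benchmark are not spelled out in this announcement, and the numbers below are toy values made up for illustration, not benchmark results. The benchmark_rates function is a hypothetical helper.

```python
def benchmark_rates(predicted: list[bool], actual: list[bool]) -> tuple[float, float]:
    """Return (false_positive_rate, recall) for a labelled set of defect pairs.

    predicted[i] is True when the detector flags pair i as a duplicate;
    actual[i] is True when pair i really is a duplicate.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return false_positive_rate, recall


if __name__ == "__main__":
    # Toy data: 3 true duplicate pairs and 11 distinct pairs.
    actual = [True] * 3 + [False] * 11
    # The detector finds 2 of the 3 duplicates and wrongly flags 1 distinct pair.
    predicted = [True, True, False] + [True] + [False] * 10
    fpr, recall = benchmark_rates(predicted, actual)
    print(f"false positive rate: {fpr:.1%}, recall: {recall:.1%}")
```

With these toy inputs, one of eleven distinct pairs is wrongly flagged and two of three duplicates are found, which is the kind of trade-off described above: few false alarms while still catching most duplicates.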