SDMX Metadata AI Assistant (MAIA): enhancing statistical metadata

What is the SDMX Metadata AI Assistant (MAIA)?

The SDMX Metadata AI Assistant (MAIA) is an open-source tool designed to address the challenges of managing and editing statistical metadata. Using AI, SDMX MAIA simplifies metadata workflows by automating key tasks such as syntax checks, consistency checks and formatting adjustments. Because it is fully compliant with the Statistical Data and Metadata eXchange (SDMX) standard, MAIA integrates into statistical pipelines and supports the efficient exchange and harmonisation of metadata.

Use Case

The primary use case of the SDMX Metadata AI Assistant (MAIA) is enhancing the quality and readability of metadata. Users upload their data as SDMX-ML files along with the related Data Structure Definitions (DSDs), select the attributes to edit and let MAIA automatically validate and refine the content. With features such as customisable AI assistants, interactive before/after comparisons and reporting in SDMX-ML format, MAIA helps statisticians maintain high-quality metadata while saving time and resources. It can be used to address metadata consistency issues, align metadata with statistical frameworks or prepare data for dissemination.

SDMX MAIA is designed with the following features:

📝 Metadata editing

  • Perform grammar, syntax, and consistency checks to ensure high-quality metadata
  • Streamline processes to improve compilers’ efficiency

🤖 Custom AI assistant

  • Configure your own AI assistant with customisable knowledge bases and instructions
  • Tailor the model to meet specific user requirements

🔗 End-to-end SDMX integration

  • Seamlessly input metadata from SDMX using pysdmx
  • Generate outputs in SDMX-ML 2.1 format

📊 Transparency

  • Automatically save each run as a JSON file
  • Maintain a clear and auditable record of all executions
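
As an illustration of what a per-run audit record could look like, here is a minimal sketch in Python. The function name `save_run_record`, the `runs` directory, the file naming and the record fields are all assumptions made for this example; they are not taken from the MAIA code base.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def save_run_record(run_dir: Path, attribute: str, changes: list) -> Path:
    """Write one run to its own timestamped JSON file and return its path.

    Hypothetical helper: the field names below are illustrative only.
    """
    run_dir.mkdir(parents=True, exist_ok=True)
    started = datetime.now(timezone.utc)
    record = {
        "timestamp": started.isoformat(),
        "attribute": attribute,
        "changes": changes,
    }
    # One file per run keeps earlier records untouched and easy to audit.
    path = run_dir / f"run_{started.strftime('%Y%m%dT%H%M%S%f')}.json"
    path.write_text(
        json.dumps(record, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        saved = save_run_record(
            Path(tmp) / "runs",
            attribute="title",
            changes=[{"before": "GDP growht rate", "after": "GDP growth rate"}],
        )
        print(saved.name)
```

Writing each run to a separate file, rather than appending to a shared one, means a record is never modified after it has been created.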

🔍 Before/after interactive comparison

  • Interactively compare metadata before and after edits
  • View results in both terminal and HTML formats for flexibility
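
A before/after comparison for both output targets can be approximated with Python's standard `difflib` module. The sketch below is not MAIA's implementation: the function names `terminal_diff` and `html_diff` are hypothetical, and the word-level granularity is an assumption chosen for short metadata strings.

```python
import difflib

# ANSI escape codes for terminal colouring.
GREEN, RED, RESET = "\033[32m", "\033[31m", "\033[0m"


def terminal_diff(before: str, after: str) -> str:
    """Word-level diff: additions in green, deletions in red."""
    parts = []
    for token in difflib.ndiff(before.split(), after.split()):
        tag, word = token[:2], token[2:]
        if tag == "+ ":
            parts.append(f"{GREEN}{word}{RESET}")
        elif tag == "- ":
            parts.append(f"{RED}{word}{RESET}")
        elif tag == "  ":
            parts.append(word)
        # Lines tagged "? " are ndiff hints, not content, so they are skipped.
    return " ".join(parts)


def html_diff(before: str, after: str) -> str:
    """Side-by-side HTML table built by difflib.HtmlDiff."""
    return difflib.HtmlDiff(wrapcolumn=60).make_table(
        [before], [after], fromdesc="Before", todesc="After"
    )


if __name__ == "__main__":
    old = "GDP growht rate, seasonaly adjusted"
    new = "GDP growth rate, seasonally adjusted"
    print(terminal_diff(old, new))
    print(html_diff(old, new)[:60])
```

The same pair of strings feeds both renderers, so the terminal and HTML views always describe the same change.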

🖥️ Streamlit UI

  • An intuitive, web-based workflow execution interface
  • Designed for application designers to simplify processes

📊 Run status and metrics

  • Display logging and performance metrics directly in the UI
  • Save detailed logs to logs/ and performance metrics to metrics_performance.json for further analysis
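
The split between log files and a metrics file could be set up along the following lines. Only the `logs/` directory and the `metrics_performance.json` file name come from the description above; the log file name `run.log`, the logger name, the helper functions and the metric layout (seconds per named step) are assumptions for this sketch.

```python
import json
import logging
import tempfile
import time
from pathlib import Path


def configure_logging(log_dir: Path) -> logging.Logger:
    """Send log records to <log_dir>/run.log (file name is an assumption)."""
    log_dir.mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger("maia_sketch")
    logger.setLevel(logging.INFO)
    # Drop handlers from earlier calls so records are not duplicated.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()
    handler = logging.FileHandler(log_dir / "run.log", encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def timed_step(name, func, metrics: dict, logger: logging.Logger):
    """Run func(), record its duration in seconds under `name`, and log it."""
    start = time.perf_counter()
    result = func()
    metrics[name] = round(time.perf_counter() - start, 6)
    logger.info("step=%s seconds=%.6f", name, metrics[name])
    return result


def save_metrics(metrics: dict, path: Path) -> Path:
    """Persist the collected timings as JSON."""
    path.write_text(json.dumps(metrics, indent=2), encoding="utf-8")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)  # a real run would use the working directory
        log = configure_logging(base / "logs")
        timings = {}
        timed_step("edit_metadata", lambda: sum(range(1000)), timings, log)
        print(save_metrics(timings, base / "metrics_performance.json").name)
        logging.shutdown()  # close the log file before the directory is removed
```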

The application is written entirely in Python and builds on four key dependencies:

  • Streamlit: Powers the interactive and intuitive web interface
  • pysdmx: Handles the retrieval and processing of metadata from SDMX sources
  • LangChain: Integrates AI workflows, enabling seamless interaction with language models
  • OpenAI: Provides access to OpenAI’s advanced AI assistant models for enhanced natural language processing capabilities

Using SDMX MAIA involves the following workflow:

  1. Upload and configure in three simple steps:
      • Upload SDMX files, including the data and the related Data Structure Definition in SDMX format;
      • Select the attribute to modify from the list of available attributes, for example “title”, “compilation” or “methodology”;
      • Configure technical settings, including the OpenAI assistant, API keys and other preferences.
  2. Customise the AI assistant:
      • Customise the AI assistant by providing specific instructions and uploading any additional files to enhance the model’s knowledge base;
      • To ensure agility and fully leverage OpenAI’s features, this step is performed on the OpenAI platform.
  3. Review and finalise:
      • Once the editing process is complete, the application provides an interactive interface to review changes;
      • The diff-and-diff feature highlights additions in green and deletions in red. Users can directly edit, overwrite or revert changes for each modified metadata item;
      • Finally, the application generates an updated SDMX file with the new metadata, ready to be ingested back into the source system.
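
The review step, in which each suggestion is kept, reverted or overwritten, can be modelled with a small data structure. This is a sketch under assumptions: the `ReviewItem` class, its field names and the three decision labels are hypothetical and only mirror the options described in the workflow above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReviewItem:
    """One edited metadata value awaiting a reviewer decision (hypothetical)."""

    key: str                      # identifier of the metadata item
    original: str                 # text before the AI edit
    suggested: str                # text proposed by the assistant
    decision: str = "accept"      # "accept", "revert" or "overwrite"
    manual_text: Optional[str] = None  # required when decision == "overwrite"

    def final_text(self) -> str:
        """Resolve the decision into the text that should be written out."""
        if self.decision == "accept":
            return self.suggested
        if self.decision == "revert":
            return self.original
        if self.decision == "overwrite":
            if self.manual_text is None:
                raise ValueError(f"{self.key}: overwrite needs manual_text")
            return self.manual_text
        raise ValueError(f"{self.key}: unknown decision {self.decision!r}")


def finalise(items: list) -> dict:
    """Map each item key to its final text, ready for export."""
    return {item.key: item.final_text() for item in items}


if __name__ == "__main__":
    reviewed = [
        ReviewItem("series_1", "GDP growht", "GDP growth"),
        ReviewItem("series_2", "CPI index", "CPI", decision="revert"),
        ReviewItem("series_3", "Unemp.", "Unemployment", decision="overwrite",
                   manual_text="Unemployment rate"),
    ]
    print(finalise(reviewed))
```

Keeping the original, suggested and manually entered texts side by side means a decision can be changed at any point before the final export without losing information.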

Learn More:

For more details, check the project documentation.