Job has expired

This job post is expired and is no longer taking new applicants.

2024-0064 Explainable AI - FRI 16 Feb

Location
Belgium only
EMW, Inc.

Job Description

Deadline Date: Friday 16 February 2024

Requirement: Explainable AI

Location: OFF-SITE

Note: Please refer to your Subcontract Agreement, article 6.4.1.a, which states “Off-Site Discount: 5% (this discount is applicable to all requirements, and applies when the assigned personnel are permitted to work Off-Site, such as at-home)”. Please be sure to price this discount in your overall price proposal when submitting bids against off-site RFQs.

Required Start Date: 18 March 2024

End Contract Date: 31 December 2024

Required Security Clearance: NATO UNCLASSIFIED

1 Introduction

1.1 Purpose

The NATO Communications and Information Agency (NCIA) is NATO’s single Information Technology service provider organization. The Agency's Innovation group supports the analysis and exploration of digital technologies with the potential to make a major positive difference to NATO. As a part of its institutional mission, NCIA is always looking for new technologies and cutting-edge solutions to be implemented in support of the Alliance.

The purpose of this statement of work (SoW) is to describe a study, to be executed by subject matter experts, that analyses the feasibility of a solution for automatically suggesting the classification of NATO text documents (Microsoft Word documents, PDFs, e-mails). Such a solution will enable the preparation of proper confidentiality and metadata labels using NATO standardization agreements (STANAG 4774, 4778, and 5636) that support the implementation of Data Centric Security (DCS) and, in the long term, strengthen NATO’s cyber security posture.

The overarching goal of the DCS concept is to maintain and enhance Information Superiority of the Alliance by increasing efficiency and effectiveness of enforcement of NATO security policies and directives related to protection, control and sharing of information.

Information superiority cannot be achieved without efficient information sharing mechanisms. What is it that precludes information sharing in a federated context? Manual labelling and release of information is a major impediment to the seamless flow of data, especially in time-constrained, dynamic operational scenarios. Services commonly used to develop and sustain situational awareness, command and control, logistics, supply chain, etc., usually generate data elements at machine speed, making it impossible for humans to label individual data elements. Moreover, there is a large volume of legacy unlabelled data that needs to be labelled in order to be shared in a coherent manner.

DCS must enable mission partners to maximize the availability of releasable information to authorized partners and decision-makers across all levels of the command structure, from strategic to tactical and from sensor to effector.

Suggesting the contents of confidentiality and metadata labels must, however, be an auditable process whose non-repudiation can be guaranteed. It is therefore recommended that the methods used in this work focus on the ability to explain the rules used to suggest the confidentiality level of a particular sentence, paragraph or document.

The Agency is seeking a supplier to study the described use case, analyse the possibility of using machine learning methods, including Large Language Models (LLMs), and assess their applicability to creating a model for defining confidentiality labels for documents using explainable AI.

2 Requirements

Under the contract granted in relation to the present SoW, the contractor shall:

  1. Perform the analysis, concept development, model creation, coding, writing of the report, holding external discussions and reporting progress.

a) The contractor shall analyse the possibility of using machine learning methods, including Large Language Models (LLMs), and assess their applicability to creating a model for defining confidentiality labels for documents using explainable AI.

b) The contractor shall analyse the input dataset received from the NCI Agency as an exemplary dataset consisting of documents. The set will contain a subset with paragraph-wide and document-wide annotations pointing to their security label.

c) The contractor shall develop a mechanism to extract metadata from the documents.

d) The contractor shall identify a set of rules determining the confidentiality level of a document, based on the provided document set. The aim of this step is to propose a set of rules large enough to evaluate the feasibility of automated methods of document classification.

e) The contractor shall apply the rules to the test document set. It is expected that the Large Language Models (LLMs) will be used to facilitate language comprehension.

f) The contractor shall demonstrate the feasibility of using AI methods (including LLMs) for automatic and semi-automatic (user-assisted) classification of documents.

  2. Develop and deliver the Technical Report describing:

a) Part 1: Solutions analysed and methodology of work.

b) Part 2: Architecture and deployment of the solution.

c) Part 3: Verification of results.

d) Part 4: Installation guide and manual together with necessary code (e.g. in Python) and Jupyter Notebooks.

  3. Take part in the project meetings and document the work done.

  4. Perform basic dissemination activities with NCI Agency staff.
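
Requirement (f) above distinguishes automatic from semi-automatic (user-assisted) classification. One possible reading of that split, offered only as a sketch, is confidence-based triage: suggestions the model is confident about are accepted automatically, while the rest are queued for a human reviewer. The `Suggestion` type, the `triage` function and the 0.9 threshold are hypothetical and are not prescribed by the SoW.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Suggestion:
    """A label proposed for one document (hypothetical structure)."""
    doc_id: str
    label: str
    confidence: float  # model-reported score in [0, 1]


def triage(
    suggestions: List[Suggestion], threshold: float = 0.9
) -> Tuple[List[Suggestion], List[Suggestion]]:
    """Split suggestions into (auto-accepted, needs human review).

    The threshold is an assumed tuning parameter; a real value would have
    to be agreed with the NCI Agency and validated on annotated data.
    """
    auto = [s for s in suggestions if s.confidence >= threshold]
    review = [s for s in suggestions if s.confidence < threshold]
    return auto, review


if __name__ == "__main__":
    batch = [
        Suggestion("doc-1", "NATO UNCLASSIFIED", 0.97),
        Suggestion("doc-2", "NATO UNCLASSIFIED", 0.55),
    ]
    accepted, queued = triage(batch)
    print([s.doc_id for s in accepted], [s.doc_id for s in queued])
```

Routing low-confidence cases to a person keeps the user in the loop where the model is least reliable, which is one way to approach the semi-automatic mode.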

The dataset consisting of documents used for the analysis will be delivered by the NCI Agency to the contractor as soon as the contract is granted.

The contractor is therefore required to:

  1. Propose metadata extraction mechanism:

a) That would include supervised learning methods and guided extraction, both aimed at paragraph-wise metadata retrieval.

b) That would be able to identify:

i) Document Details, e.g.: classification level (if available), id, type, date and language.

ii) Authorship Information, e.g.: author information and originating department or agency.

iii) Content Information, e.g.: main subject/topic, keywords, extracted entities, or specific names.

iv) Geographic Data, e.g.: specific location or region mentioned in the document.

v) Mentioned Parties, e.g.: details related to the associated entities mentioned in the document.

vi) Related Documentation, e.g.: references to associated, linked, or referred documents.

vii) Other metadata according to STANAG 5636 and 4774.

STANAGs 5636 and 4774 will be delivered by the NCI Agency to the contractor as soon as the contract is granted.
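
As an illustration of the guided-extraction half of this requirement, the following minimal sketch pulls a few of the listed fields (classification marking, dates, referenced STANAGs) from each paragraph using regular expressions. It makes several assumptions: that markings appear in the text as plain strings, that dates use the "16 February 2024" form, and that paragraphs are separated by blank lines. The names `ParagraphMetadata` and `extract_paragraph_metadata` are hypothetical, and the supervised-learning component and the actual STANAG 5636/4774 field set are not covered.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed surface forms; the real dataset may mark documents differently.
MARKING_RE = re.compile(r"\bNATO (?:UNCLASSIFIED|RESTRICTED|CONFIDENTIAL|SECRET)\b")
DATE_RE = re.compile(
    r"\b\d{1,2} (?:January|February|March|April|May|June|July|August|"
    r"September|October|November|December) \d{4}\b"
)
REFERENCE_RE = re.compile(r"\bSTANAG \d{4}\b")


@dataclass
class ParagraphMetadata:
    """Metadata found in one paragraph (hypothetical record layout)."""
    index: int
    marking: Optional[str] = None
    dates: List[str] = field(default_factory=list)
    references: List[str] = field(default_factory=list)


def extract_paragraph_metadata(text: str) -> List[ParagraphMetadata]:
    """Split text on blank lines and extract metadata paragraph by paragraph."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    records = []
    for index, paragraph in enumerate(paragraphs):
        marking = MARKING_RE.search(paragraph)
        records.append(
            ParagraphMetadata(
                index=index,
                marking=marking.group(0) if marking else None,
                dates=DATE_RE.findall(paragraph),
                # De-duplicate and sort so the output is deterministic.
                references=sorted(set(REFERENCE_RE.findall(paragraph))),
            )
        )
    return records


if __name__ == "__main__":
    sample = (
        "NATO UNCLASSIFIED\nIssued 16 February 2024 under STANAG 4774.\n\n"
        "This paragraph cites STANAG 5636 and STANAG 4774."
    )
    for record in extract_paragraph_metadata(sample):
        print(record)
```

Keeping one record per paragraph preserves the position of each finding, so later steps can point back to the text that produced it.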

  2. Propose a set of rules for determining confidentiality levels of the documents:

a) The rule set should enable evaluation of the overall proposed model for automatically suggesting metadata and security labels.

b) The rules should be defined on varying levels of abstraction and complexity, including: use of extracted metadata, specific sensitive terms, phrases, or context requiring semantic understanding of the text.

  3. Automatically apply rules to the test set:

a) To assess the confidentiality level of sentences, paragraphs and whole documents.

b) At every step, preserve a high level of transparency and accountability.

c) Evaluate system performance and gather user experience.

  4. NCI Agency staff will provide oversight and inputs that shall be taken into account in the conduct of the work but which shall not waive the contractor’s responsibility for the deliverables.
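
The rule-definition and rule-application steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not a proposed design: each rule carries an identifier and a human-readable description so that every suggested level can be traced back to the rule and paragraph that triggered it. The example terms and the levels attached to them are invented for illustration and do not reflect any real classification policy. Taking the highest triggered level as the document-level suggestion, and falling back to the lowest level when no rule fires, are also assumptions. Only simple keyword rules are shown; rules needing semantic understanding (for example, LLM-backed ones) could plug in through the same `matches` callable.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Assumed ordering of levels, lowest to highest.
LEVELS = ["NATO UNCLASSIFIED", "NATO RESTRICTED", "NATO CONFIDENTIAL", "NATO SECRET"]


@dataclass(frozen=True)
class Rule:
    rule_id: str
    description: str  # shown to the user as the explanation
    level: str
    matches: Callable[[str], bool]


@dataclass(frozen=True)
class Finding:
    """One rule firing on one paragraph: the unit of the audit trail."""
    paragraph_index: int
    rule_id: str
    level: str
    description: str


def keyword_rule(rule_id: str, level: str, terms: Sequence[str]) -> Rule:
    """Build a rule that fires when any term occurs (case-insensitive)."""
    lowered = [t.lower() for t in terms]
    return Rule(
        rule_id=rule_id,
        description=f"mentions one of {sorted(terms)}",
        level=level,
        matches=lambda text: any(t in text.lower() for t in lowered),
    )


def assess(paragraphs: List[str], rules: List[Rule]) -> Tuple[str, List[Finding]]:
    """Return (suggested document level, findings that justify it)."""
    findings = [
        Finding(i, rule.rule_id, rule.level, rule.description)
        for i, paragraph in enumerate(paragraphs)
        for rule in rules
        if rule.matches(paragraph)
    ]
    # Assumption: the document inherits the highest level triggered anywhere.
    suggested = max((f.level for f in findings), key=LEVELS.index, default=LEVELS[0])
    return suggested, findings


if __name__ == "__main__":
    rules = [
        keyword_rule("R1", "NATO RESTRICTED", ["deployment schedule"]),
        keyword_rule("R2", "NATO CONFIDENTIAL", ["network topology"]),
    ]
    level, findings = assess(
        ["Lunch menu for Friday.", "The deployment schedule is attached."], rules
    )
    print(level)
    for finding in findings:
        print(finding)
```

Returning the findings alongside the suggestion is what supports the transparency requirement: a reviewer sees which paragraph triggered which rule, not only the final label.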

2.1 Schedule of Deliverables

The contractor shall undertake the necessary activities to provide the following deliverables by their respective due dates.

Primary Output D1: Technical Report – version 1. Description of the solutions to be analysed and methodology of work. Metadata extracted from the data set.

Due Date: 30.04.2024

Primary Output D2: Technical Report – version 2. Description of the rules generated.

Due Date: 31.05.2024

Final Output D3: Final version of the Technical Report. Code, executables and code documentation; evaluation of the system’s performance on a new dataset and demonstration of the feasibility of using AI methods (including LLMs) for automatic and semi-automatic (user-assisted) classification of documents (Word file). By the time of delivery, all related documentation should be submitted and dissemination activities conducted.

Due Date: 15.07.2024

Reports and documentation must be written in English and delivered in Microsoft Word docx format.

2.2 Payment Schedule

Payments for the deliverables will follow the schedule below, each after successful delivery. Successful delivery of the outcomes will be confirmed by signing a document produced according to Annex A: Delivery Acceptance Sheet Template.

Associated Deliverables: D1

Deliverable Due Date: 30 April 2024

Payment value (% from total contract value): 30%

Associated Deliverables: D2

Deliverable Due Date: 31 May 2024

Payment value (% from total contract value): 30%

Associated Deliverables: D3

Deliverable Due Date: 15 July 2024

Payment value (% from total contract value): 40%

Comments: The payment shall be dependent upon successful acceptance of the Delivery Acceptance Sheet.

2.3 Coordination

It is anticipated that coordination between the NCI Agency and the contractor will take place electronically and that there will be no requirement for the contractor to visit the Agency.

2.4 Security Clearance

NATO security clearance is not required by the contractor.

3 Evaluation

This offer will be evaluated using the best value method, based on a combination of cost and technical scope.

3.2 Criteria for technical scope

This offer will be evaluated against the following criteria:

• Proposed approach to study and methodology, including project-related requirements:

  o Automatic application of rules for assessing the confidentiality levels of individual paragraphs.
  o Using Large Language Models (LLMs) to augment language comprehension.
  o Providing transparency and explainability of confidentiality assessments.
  o Providing deliverables including: a codebase for data parsing, metadata extraction, and confidentiality assessment including rule application and explanation.
  o The explainability of the rules applied to classify parts of the document is key to the realisation of this work.
  o The system must be able to display the results of confidentiality classification on a collection of documents held out from the training process.

• Contractor experience of operation on the market in the field of AI, denoted by:

  o A business tenure of no less than five years in the relevant market.
  o The employment of a contingent of no fewer than twenty specialists in the field of Artificial Intelligence.

• Contractor experience and track record of delivering AI-based solutions, encompassing:

  o At least one commercial project related to document analytics, such as key information retrieval.
  o At least one commercial project in the field of Natural Language Processing, including named entity recognition and intent recognition.
  o At least one commercial project involving fine-tuning open-source LLMs.

• The Contractor should have a demonstrated capability to implement a GUI application with an LLM-based solution, inclusive of explainability functionalities such as valid summarizations and citations.
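
The criterion on showing classification results for documents kept out of training implies some form of held-out evaluation. A minimal sketch of what such a report could compute is given below: per-label precision and recall, plus the share of documents whose suggested level is lower than the annotated one. The choice of metrics, the function names, and the short labels in the usage example are assumptions; the SoW does not specify how performance is to be measured.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def per_label_scores(
    gold: Sequence[str], predicted: Sequence[str]
) -> Dict[str, Tuple[float, float]]:
    """Return {label: (precision, recall)} over a held-out set."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    true_pos, false_pos, false_neg = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            true_pos[g] += 1
        else:
            false_pos[p] += 1  # predicted label was wrong
            false_neg[g] += 1  # annotated label was missed
    scores = {}
    for label in sorted(set(gold) | set(predicted)):
        predicted_total = true_pos[label] + false_pos[label]
        gold_total = true_pos[label] + false_neg[label]
        precision = true_pos[label] / predicted_total if predicted_total else 0.0
        recall = true_pos[label] / gold_total if gold_total else 0.0
        scores[label] = (precision, recall)
    return scores


def under_classification_rate(
    gold: Sequence[str], predicted: Sequence[str], levels: List[str]
) -> float:
    """Share of documents suggested at a lower level than annotated.

    `levels` lists the labels from lowest to highest (assumed ordering).
    """
    rank = {label: i for i, label in enumerate(levels)}
    under = sum(1 for g, p in zip(gold, predicted) if rank[p] < rank[g])
    return under / len(gold) if gold else 0.0


if __name__ == "__main__":
    gold = ["U", "R", "R", "C"]
    predicted = ["U", "R", "U", "C"]
    print(per_label_scores(gold, predicted))
    print(under_classification_rate(gold, predicted, ["U", "R", "C"]))
```

Under-classification is reported separately because a suggestion that is too low is likely the more costly error in this setting, whereas one that is too high mainly restricts sharing.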

Advice from our career coach

Are you an AI expert with a knack for explainable AI? The NATO Communications and Information Agency (NCIA) is seeking a supplier to analyze the possibility of using machine learning methods, including Large Language Models (LLMs), to create a model for defining confidentiality labels for documents using explainable AI. Your role would involve analyzing input datasets, developing metadata extraction mechanisms, proposing rules for determining confidentiality levels, and applying those rules to test sets. The goal is to improve information sharing mechanisms and enhance NATO's cyber security posture. If you have experience in AI and document analytics, this is the perfect opportunity to showcase your skills. So, join us in this cutting-edge project and help shape the future of NATO's information superiority!
