Internal Displacement Monitoring Centre (IDMC)
The Internal Displacement Monitoring Centre (IDMC) is the world's definitive source of data and analysis on internal displacement. Since our establishment in 1998 as part of the Norwegian Refugee Council (NRC), we have offered a rigorous, independent and trusted service to the international community. Our work informs policy and operational decisions that improve the lives of the millions of people living in internal displacement, or at risk of becoming displaced in the future.
The purpose of this position is to support the consolidation and disaggregation of historical IDMC data. The consultant will work in collaboration with a Monitoring Expert consultant and will report to IDMC's Information Management Adviser.
As Data Scientist, the consultant will analyze historical figures, perform a gap analysis, fill data gaps by analyzing the PDF and Excel files from which the figures were originally sourced, and geolocate the data based on the information provided by the original source.
As a Data Scientist Consultant, you will be responsible for the following tasks:
- Analyzing historical figures to identify data gaps within IDMC's records.
- Conducting a comprehensive text analysis of external databases (e.g., ReliefWeb, HDX, EM-DAT, GLIDE) that publish data through APIs and as PDF and Excel files. The primary objective is to identify and match data that is missing from IDMC's historical figures.
- Extracting essential information, including dates, locations, event names, and other metadata, from the source documents of figures to fill data gaps in IDMC's historical data.
- When possible, disaggregating IDMC's figures by location and date, based on information available in the source documents.
- Identifying suitable candidates within the external databases to fill data gaps in IDMC figures, metadata, and source documents (documents containing the information published in IDMC's figures).
- Implementing automated quality control measures to ensure the reliability of the filled data gaps. Logical rules will be applied to check the compiled information.
- Consolidating historical data to facilitate the revision process of the Monitoring Expert consultant.
- Improving the methodology workflow based on feedback received from the Monitoring Expert consultant.
- Delivering a consolidated historical dataset to IDMC's Information Management Adviser, with as few remaining data gaps as possible.
- Documenting the gap analysis and methodologies used to fill data gaps on IDMC's GitHub repository.
- Transferring the documented code and any additional materials necessary to achieve the consultancy's goals to IDMC's Information Management Adviser.
Through these tasks, the consultant will play a crucial role in enhancing the accuracy and completeness of IDMC's historical data, contributing to improved data analysis and informed decision-making processes.
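As an illustration of the gap analysis described in the tasks above, the following minimal Python sketch counts how many records lack a value for each required field. The field names and the sample records are hypothetical placeholders, not IDMC's actual schema or figures.

```python
from collections import Counter

# Hypothetical field names; the real IDMC schema may differ.
REQUIRED_FIELDS = (
    "country", "year", "event_name", "start_date", "location", "source_document",
)


def is_missing(value):
    """Treat None and blank strings as missing."""
    return value is None or (isinstance(value, str) and not value.strip())


def gap_report(records):
    """Count, per required field, how many records lack a value."""
    gaps = Counter()
    for record in records:
        for field in REQUIRED_FIELDS:
            if is_missing(record.get(field)):
                gaps[field] += 1
    return {field: gaps[field] for field in REQUIRED_FIELDS}


# Illustrative records only, not real IDMC figures.
sample = [
    {"country": "AAA", "year": 2005, "event_name": "Flood A",
     "start_date": "2005-06-01", "location": "", "source_document": None},
    {"country": "BBB", "year": 2010, "event_name": None,
     "start_date": None, "location": "District X", "source_document": "report.pdf"},
]
report = gap_report(sample)
```

A per-field count of this kind could feed the gap analysis report and help prioritize which fields to fill first.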
The goal of this consultancy is to enable effective consolidation and disaggregation of historical IDMC data through comprehensive data analysis, geographical disaggregation, hazard matching, and the development of an appropriate workflow to assign P-codes to geolocated data. The consultancy aims to rectify data gaps, streamline data management processes, and enhance the organization's ability to accurately correlate and interpret historical data.
CONSULTANCY TIME and IMPLEMENTATION TIMEFRAME
Starting date: 1.11.2023 – ending 31.01.2024.
Duration: 5 months, approximately 100 days.
The consultant should be available to work almost full-time from September to January.
This is a home-based consultancy, with regular interaction by teleconference and possibly some face-to-face meetings.
Key Deliverables include:
Deliverable 1: Gap analysis report
- Consolidated baseline historical data (2001 to 2015)
- Gap analysis report
Deliverable 2: Filling data gaps (data preprocessing, feature extraction and entity recognition)
- Implemented methodology for generating candidates for the variables of interest (e.g., dates, locations, figure source documents) from external databases (e.g., ReliefWeb, HDX, EM-DAT, GLIDE).
- Implemented methodology to align or match data from different sources.
- Consolidated dataset with matching candidates for the variables of interest and figure source documents.
- Adjustments to the dataset may be suggested, based on the validation of results conducted by the Monitoring Expert consultant.
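One possible way to approach the candidate matching in Deliverable 2 is sketched below in Python, using only the standard library. It assumes each record carries a country, a date and an event name (hypothetical field names), and the thresholds `max_days` and `min_similarity` are illustrative defaults rather than values prescribed by IDMC. The consultant remains free to propose a different method.

```python
from datetime import date
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Case-insensitive similarity ratio between 0 and 1."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def find_candidates(figure, external_events, max_days=30, min_similarity=0.6):
    """Return external events in the same country, close in time and name, best first."""
    candidates = []
    for event in external_events:
        if event["country"] != figure["country"]:
            continue
        day_gap = abs((event["date"] - figure["date"]).days)
        if day_gap > max_days:
            continue
        score = name_similarity(figure["event_name"], event["event_name"])
        if score >= min_similarity:
            candidates.append((score, day_gap, event))
    # Highest name similarity first, then smallest date gap.
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return [event for _, _, event in candidates]


# Illustrative data only; identifiers and events are made up.
figure = {"country": "AAA", "date": date(2010, 7, 15), "event_name": "Flood in Region X"}
external = [
    {"id": "ext-1", "country": "AAA", "date": date(2010, 7, 20),
     "event_name": "Flood in Region X (update)"},
    {"id": "ext-2", "country": "AAA", "date": date(2012, 1, 1),
     "event_name": "Flood in Region X"},
    {"id": "ext-3", "country": "BBB", "date": date(2010, 7, 16),
     "event_name": "Flood in Region X"},
    {"id": "ext-4", "country": "AAA", "date": date(2010, 7, 16),
     "event_name": "FLOOD IN REGION X"},
]
matches = find_candidates(figure, external)
```

Ranked candidates of this kind would still need to be validated by the Monitoring Expert consultant before being accepted into the dataset.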
Deliverable 3: Data cleaning
- The consultant will prepare a consolidated historical dataset and clean the data for analysis.
- With the support of the Monitoring Expert consultant, the consultant will document the data cleaning steps and the tools used for data cleaning.
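The logical rules used for automated quality control and data cleaning might look like the Python sketch below. The three rules and the field names (`figure`, `start_date`, `end_date`, `source_document`, `disaggregated`) are assumptions chosen for illustration; the actual rules would be agreed with IDMC and the Monitoring Expert consultant.

```python
from datetime import date


def check_record(record):
    """Return a list of rule violations for one record (empty list means it passed)."""
    problems = []
    figure = record.get("figure")
    if not isinstance(figure, int) or figure < 0:
        problems.append("figure must be a non-negative integer")
    start, end = record.get("start_date"), record.get("end_date")
    if start and end and end < start:
        problems.append("end_date is before start_date")
    if not record.get("source_document"):
        problems.append("source_document is missing")
    # Assumed rule: location-level figures, when present, should add up to the total.
    parts = record.get("disaggregated") or []
    if parts and sum(parts) != figure:
        problems.append("disaggregated figures do not sum to the total")
    return problems


# Illustrative records only, not real IDMC figures.
good = {"figure": 1200, "start_date": date(2008, 5, 1), "end_date": date(2008, 5, 20),
        "source_document": "report.pdf", "disaggregated": [700, 500]}
bad = {"figure": 1200, "start_date": date(2008, 5, 20), "end_date": date(2008, 5, 1),
       "source_document": "", "disaggregated": [700, 400]}
good_problems = check_record(good)
bad_problems = check_record(bad)
```

Returning the list of violations, rather than a single pass/fail flag, makes it easier to document which cleaning steps each record needed.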
Deliverable 4: Consolidated and disaggregated dataset of historical IDMC data
- Handover of the consolidated dataset to IDMC
- Final documentation of the process of consolidating the historical IDMC data. The documentation will include a description of the data consolidation steps, the tools used for data consolidation, code, and the results of the data consolidation.
- This deliverable will include the handover of the code and documentation used to complete the project.
- The code will be well-documented and easy to understand.
The consultant will be line-managed by the Information Management Adviser.
SKILLS and QUALIFICATIONS
- Degree in data science or related field.
- 2 years of experience in data analysis and data mining.
- Strong proficiency in Python, R, or other programming languages.
- Strong knowledge of GitHub and Jupyter notebooks.
- Strong understanding of Natural Language Processing (NLP) methods.
- Experience with geographical data and geolocation methodologies.
- Excellent problem-solving and analytical skills.
- Ability to work independently and as part of a team.
- Excellent written and verbal communication skills.
Consultants who meet the above requirements are invited to submit an expression of interest by Monday 9 October 2023, 23:59 CET to Maria Teresa Miranda (email@example.com) and include the following as part of their application:
- Curriculum Vitae;
- Contact information including the consultant’s name, email address and phone number(s);
- Proposed daily consultancy rate in CHF or EUR, inclusive of VAT and all charges;
- List of references that can be contacted to verify the quality of services;
- Proof of registration as a sole trader/registered company.
Please note: All service providers/consultants working with NRC should maintain high standards on ethical issues, respect and apply basic human and social rights, ensure non-exploitation of child labour, and give fair working conditions to their staff. NRC reserves the right to reject quotations provided by suppliers not meeting these standards. Consultants doing business with NRC will be screened on anti-corruption due diligence before NRC confirms a contract.