NASA reaches out to Graph DB to find people and skills for lunar and Mars missions • The Register


NASA has set itself the ambitious goal of getting people back to the Moon by 2024, as a precursor to the far bigger leap to Mars at a later date. It’s a struggle even for the world-renowned space agency.

According to a recent report from the Government Accountability Office (GAO), it is not yet certain how much this will cost or how long it will take to get the equipment ready. Hardware challenges abound, including testing an electric ion propulsion system for the orbiting capsule, named Gateway.

On top of that, the question arises whether NASA has the right people to take on the Moon mission and the trip to the red planet beyond. Looking for answers is David Meza, a senior data scientist in the agency’s People Analytics group, who has started using graph database technology to help.

Like many HR and personnel IT operations, NASA maintains a mix of applications and analytics systems, each of which relies on a relational database.

“We have SAP, ServiceNow, and some of the different RDBMS that have different types of information about the employees, our job roles, our jobs, and so on,” Meza said.

But connecting the information between the systems and understanding the relationships between them was the hard part. Data on individuals can reside on one system, while training courses are managed on another and databases on projects reside elsewhere. “It made it difficult to find the right relationship very easily,” he said.

Step forward graph databases. Meza has worked with graph databases for more than 10 years, including during his time as Chief Knowledge Officer at NASA, when he applied them to a lessons-learned database problem.

The same capabilities would help with the workforce challenge NASA now faces: “This is a graph problem,” he said.

The aim was to find the relationships between descriptions of knowledge, skills, abilities, tasks and technologies (KSATTs) and professions, roles and training. Bringing these relationships together would help the organization find its gaps, weaknesses, or strengths, he said.
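As a rough illustration of why this reads as a graph problem, the sketch below models people, roles and skills as nodes and finds the skills a role requires that a person lacks. It is a minimal in-memory stand-in, not NASA's model or Neo4j's API: the class `PropertyGraph`, the edge types `HAS_SKILL` and `REQUIRES`, and all of the sample data are assumptions made up for the example.

```python
from collections import defaultdict


class PropertyGraph:
    """Tiny in-memory graph: nodes carry a label, edges carry a type.

    Hypothetical helper for illustration only; a real deployment would
    use a graph database rather than Python dictionaries.
    """

    def __init__(self):
        self.labels = {}                # node id -> label, e.g. "Person"
        self.edges = defaultdict(set)   # (source id, edge type) -> target ids

    def add_node(self, node_id, label):
        self.labels[node_id] = label

    def add_edge(self, source, edge_type, target):
        self.edges[(source, edge_type)].add(target)

    def neighbors(self, source, edge_type):
        # Return a copy so callers cannot mutate the stored edge set.
        return set(self.edges.get((source, edge_type), set()))


def skill_gap(graph, person, role):
    """Skills the role requires that the person is not linked to."""
    return graph.neighbors(role, "REQUIRES") - graph.neighbors(person, "HAS_SKILL")


# Invented sample data: one person, one role, three skills.
graph = PropertyGraph()
graph.add_node("alex", "Person")
graph.add_node("propulsion_engineer", "Role")
for skill in ("orbital mechanics", "systems engineering", "python"):
    graph.add_node(skill, "Skill")

graph.add_edge("alex", "HAS_SKILL", "python")
graph.add_edge("alex", "HAS_SKILL", "systems engineering")
graph.add_edge("propulsion_engineer", "REQUIRES", "orbital mechanics")
graph.add_edge("propulsion_engineer", "REQUIRES", "systems engineering")

gap = skill_gap(graph, "alex", "propulsion_engineer")
print(sorted(gap))
```

In a relational schema the same question would typically need joins across person, skill and role tables held in separate systems; in a graph it is a difference between two sets of neighbours.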

NASA is under pressure to make progress in understanding the skills of the people it directly employs: around 17,000 people. In its report, the GAO said:

“We found that NASA has made some progress in addressing the workforce challenges … but the directorate has other challenges to overcome. For example, an assessment of the program status found that the Mission Directorate had key staff in acting roles and many vacancies, particularly in the Advanced Exploration Systems (AES) division. The Directorate has made progress in filling a number of these posts on a permanent basis. In the AES division, however, eight of the 25 management positions were still held on a temporary basis as of December 2020.”

But Meza’s team’s challenges go beyond the high-profile missions to return to the moon and on to Mars.

“We have other missions: we’re trying to support geosciences and climate change. We work a lot in the aerospace industry. We also do a lot of software and technology development that we provide, as well as our medical information that we get from experiments we do on the International Space Station,” he said.

In building a graph system to understand the people, roles, and skills it would take to overcome such scientific and engineering challenges, Meza turned to Neo4j, a vendor whose technology he had used for many years in previous NASA roles.

He said the vendor’s labeled property graph approach was “more intuitive” than competitor TigerGraph’s preferred Resource Description Framework (RDF). “For me, it made relationships and connections easier. In my opinion, my SQL background made it easier for me to see these relationships in a labeled property graph model.”

He also said that Neo4j features such as its data science algorithms were attractive.

In approaching the problem, Meza broke it down at the most basic level into information about knowledge, skills, and tasks, each of which may be attached to a job or role, a person, or a training course.

To build the information corpus on job-related skills, the team used O*NET, a database from the US Department of Labor, and ESCO, the European Skills, Competences, Qualifications and Occupations database.

To understand the relationships between documents such as training manuals, project papers and résumés on the one hand, and the corpus of skills information on the other, Meza used Doc2Vec, an extension of the Word2Vec NLP algorithm developed by Tomas Mikolov at Google and published in 2013.

Doc2Vec expresses a document as a multidimensional vector so that it can be compared with other documents from other databases. This makes it possible to compare a paragraph describing a piece of knowledge with a paragraph in a résumé or job description.
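The comparison step can be sketched in a few lines. The example below is not Doc2Vec: to stay self-contained it swaps in a plain word-count vector as the embedding, whereas a trained Doc2Vec model would produce a dense learned vector. The cosine-similarity comparison that follows is the same idea either way. The function names `embed` and `cosine_similarity` and the three sample sentences are assumptions invented for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Stand-in embedding: lower-cased word counts.

    Assumption for illustration; a trained Doc2Vec model would return
    a dense learned vector here instead of sparse counts.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented sample texts: a skill description, a matching résumé line,
# and an unrelated résumé line.
skill_text = "Knowledge of orbital mechanics and spacecraft trajectory design"
resume_text = "Designed spacecraft trajectory plans using orbital mechanics models"
unrelated_text = "Managed payroll and employee benefits administration"

related_score = cosine_similarity(embed(skill_text), embed(resume_text))
unrelated_score = cosine_similarity(embed(skill_text), embed(unrelated_text))
print(related_score > unrelated_score)
```

Word counts only reward exact word overlap, which is the limitation learned embeddings are meant to address: they can place "trajectory design" and "flight path planning" near each other even when no words match.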

“We still have to validate and verify it, so we’re working on it with our industrial and organizational psychologists,” said Meza.

Staff will also validate the model themselves, partly to improve model training and partly to help them understand the program.

The team – Meza plus three people – is planning to set up the interface so that employees can start validating their profiles and adding information in the summer.

Managers will then get access, with the aim of building a recommendation engine that lets people browse the various opportunities at NASA and find the ones that best suit their knowledge and skills. The Neo4j back-end database is hosted on Google Cloud Platform.
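The article does not say how such a recommendation engine would score matches. One simple possibility, shown here purely as a sketch, is to rank opportunities by the Jaccard overlap between a person's skills and the skills each opportunity lists. The functions `jaccard` and `recommend`, the scoring rule, and the sample data are all hypothetical.

```python
def jaccard(a, b):
    """Share of skills common to both sets, out of all skills in either set."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(person_skills, opportunities, top_n=3):
    """Return up to top_n opportunity titles, best match first.

    Opportunities with no skill overlap are dropped; ties are broken
    alphabetically so the ordering is deterministic.
    """
    scored = [
        (jaccard(person_skills, skills), title)
        for title, skills in opportunities.items()
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for score, title in scored[:top_n] if score > 0]


# Invented sample data.
person_skills = {"python", "data analysis", "machine learning"}
opportunities = {
    "Data Scientist": {"python", "machine learning", "statistics"},
    "Flight Controller": {"orbital mechanics", "communications"},
    "Software Engineer": {"python", "software testing"},
}

matches = recommend(person_skills, opportunities)
print(matches)
```

A production system would more likely score matches from graph relationships and document similarity than from exact skill-name overlap, but the ranking step would look much the same.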

Starting from an annual budget of around $150,000, Meza expects spending on the project to grow along with its use cases.

“As we move into full-scale production and demonstrate the value of these skills, I am already getting inquiries from individuals asking how we can extend this to different data sources. So I can imagine that this budget will grow significantly in the next year,” said Meza.

Whether it will help NASA get to the Moon by 2024, and later to Mars, is not entirely clear. But it may be one small step in the right direction. ®

