2022 Broader Impact Highlights
Data Curation and Discovery – Making Experiments and Discovery Remotely Accessible and Actionable

What Has Been Achieved: 2DCC’s web-based, facility-wide data management system, LiST, has captured the full history of individual samples, covering over 5000 materials synthesis samples and over 6000 ex-situ characterizations (e.g., AFM, STM, TEM, XRD, XPS, among many others). This extensive and growing database is the impetus for new, creative research through the data request mechanism, whereby remote users can open new lines of experimental or data science inquiry by requesting packaged sets of synthesis and/or characterization data.

LiST is connected to the software executing the growth recipe in order to capture metadata about the sample (such as the substrate used and the project a sample is grown for). That data can then be combined with both the growth recipe as developed by the researchers and the growth conditions that synthesis instruments save in their log files. Samples are created in LiST automatically after each instrument run. This ensures that LiST captures not only the "good" samples but also failed growth runs, adding to the value of the database for machine learning and automation. To ensure that LiST can also capture the subsequent history of samples, it has been integrated into the staff workflow. LiST is used to share data with users and to organize sample shipping, user projects, and 2DCC-relevant publications.
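
The capture step described above can be illustrated with a minimal sketch. It is not LiST's actual implementation or schema: the names `SampleRecord`, `SampleStore`, `record_run`, and `add_event`, the ID format, and all field names and values are hypothetical, chosen only to show how run metadata, the growth recipe, and logged conditions might be merged into one record for every run, including failed ones.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class SampleRecord:
    """Hypothetical sample record; the fields are illustrative, not LiST's schema."""
    sample_id: str
    metadata: dict[str, Any]           # e.g. substrate used, project
    recipe: dict[str, Any]             # growth recipe as developed by researchers
    logged_conditions: dict[str, Any]  # conditions taken from the instrument log
    succeeded: bool                    # failed runs are stored as well
    history: list[str] = field(default_factory=list)


class SampleStore:
    """In-memory stand-in for the database (an assumption for this sketch)."""

    def __init__(self) -> None:
        self._records: dict[str, SampleRecord] = {}
        self._counter = 0

    def record_run(self, metadata: dict[str, Any], recipe: dict[str, Any],
                   logged_conditions: dict[str, Any], succeeded: bool) -> SampleRecord:
        """Create a record after every instrument run, successful or not."""
        self._counter += 1
        sample_id = f"S{self._counter:05d}"  # hypothetical ID format
        record = SampleRecord(sample_id, dict(metadata), dict(recipe),
                              dict(logged_conditions), succeeded)
        record.history.append("created after instrument run")
        self._records[sample_id] = record
        return record

    def add_event(self, sample_id: str, event: str) -> None:
        """Append a later step (characterization, shipping, ...) to the history."""
        self._records[sample_id].history.append(event)

    def all_records(self) -> list[SampleRecord]:
        """Return records in creation order."""
        return list(self._records.values())


# Placeholder values for illustration only.
store = SampleStore()
good = store.record_run({"substrate": "sapphire", "project": "P-1"},
                        {"steps": ["heat", "grow"]}, {"temperature_C": 800},
                        succeeded=True)
bad = store.record_run({"substrate": "sapphire", "project": "P-1"},
                       {"steps": ["heat", "grow"]}, {"temperature_C": 650},
                       succeeded=False)
store.add_event(bad.sample_id, "characterized by AFM")
```

Keeping the failed run alongside the successful one, each with its own history, reflects the point that both kinds of samples carry value for later machine learning use.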

Importance of the Achievement: Users can visualize and explore their data through a web-based user interface, which is especially important for remote users who help guide their experiments together with onsite 2DCC staff. This is critical not only during pandemic times but also for broadening participation in any circumstance for those who cannot attend onsite, including those doing online training on synthesis or characterization techniques. After projects end, 2DCC-contributed data will be made accessible to the community via the LiST user interface. The knowledge graph work in LiST 2.0, made possible by the robust underlying database and its structure, will enable modes of inquiry for data science applications that were not previously possible.

Unique Feature(s) of the MIP that Enabled this Achievement: 2DCC’s combined platform focus on both knowledge sharing and materials discovery, applied to external users and in-house research alike, enables investments that not only make data accessible but also empower remote users. This approach fuels new lines of inquiry in data science that contribute to accelerating materials discovery.