The goal
A development tool to help design and document RDF graphs.
Context
I am currently doing a lot of work with RDF graphs. This involves dealing with many publicly available ontologies and some custom internal ones; writing SHACL shapes to encode, validate and document how the data should be represented in the data graph by combining terms from these ontologies; and maintaining mappings between tabular data in relational databases and virtual graphs.
For each of these subjects I use different tools to visualize my work in my browser. I can publish the HTML produced by these tools to communicate with my colleagues who are working on different aspects of the data graph, and they can use the same tools to document their work for me.
Unfortunately I have some major problems with these tools:
- Individual limitations: visualizations do not scale; too verbose; not verbose enough; lack support for some RDF features; etc.
- Fragmentation: I need multiple tools to create multiple websites, of varying quality, for different aspects.
- Scale limitations: all these tools work for a single ontology, a single controlled vocabulary or a single application profile, not for collections.
And an overall problem: I cannot hack them all together to provide a unified view.
Scope
This project is for me first, but hopefully it will also help other people doing the same kind of work.
Some constraints:
- Needs to run fast so it can be used in the iterative process of developing shapes and ontologies
- Very limited number of moving parts
- No web development, no JavaScript and no dynamic application
- No database
- Reuse existing tech as much as possible
- Provide text and diagrams
Early choices
My main idea for this project is to create a sort of compiler that takes some RDF files as input and produces something I can feed to a static website generator.
graph LR
shacl[SHACL shapes]
ontologies[Ontologies]
sparql_examples[SPARQL examples]
mappings[Virtual graphs mappings]
workspace((compiler))
input_static_web_gen[input for static website generator]
shacl --> |input| workspace
ontologies --> |input| workspace
sparql_examples --> |input| workspace
mappings --> |input| workspace
workspace --> |output| input_static_web_gen
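The diagram above can be read as a single function from a set of typed inputs to a set of generated files. Here is a minimal sketch of that shape in Rust, using only the standard library. All names (`InputKind`, `Input`, `OutputFile`, `compile`) and the output directory layout are hypothetical, and the actual RDF parsing is deliberately left out: the sketch only shows the data flow.

```rust
/// The four kinds of input shown in the diagram (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InputKind {
    ShaclShapes,
    Ontology,
    SparqlExample,
    Mapping,
}

impl InputKind {
    /// Output directory used for this kind of input (an assumption of this sketch).
    fn section(self) -> &'static str {
        match self {
            InputKind::ShaclShapes => "shapes",
            InputKind::Ontology => "ontologies",
            InputKind::SparqlExample => "queries",
            InputKind::Mapping => "mappings",
        }
    }
}

/// One input document: its kind, a short name, and its raw text.
pub struct Input {
    pub kind: InputKind,
    pub name: String,
    pub content: String,
}

/// One file handed to the static website generator.
#[derive(Debug, PartialEq, Eq)]
pub struct OutputFile {
    pub path: String,
    pub body: String,
}

/// Placeholder "compiler": maps every input to exactly one output file.
/// A real implementation would parse the RDF and render documentation
/// instead of copying the raw content.
pub fn compile(inputs: &[Input]) -> Vec<OutputFile> {
    inputs
        .iter()
        .map(|input| OutputFile {
            path: format!("{}/{}.md", input.kind.section(), input.name),
            body: input.content.clone(),
        })
        .collect()
}
```

Keeping the compiler a pure function from inputs to files, with no database and no server, would fit the constraints listed in the scope: the result is just a directory tree that the static website generator picks up.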
I plan to use Rust. My main reason is that it is the only high-level compiled programming language I know and am comfortable with. I like type systems, I like rust-analyzer, and there is already a crate ecosystem that handles the parsing of RDF files and supports queries: Sophia.
One of the main ideas is to delegate the rendering to other software. I could try to make something very generic and customisable, with everyone bringing their own template files. But for now my main "client" is myself. I am going to pick a single static website generator and stick to it, then make something more generic later if needed. I guess I will start with Zola (so Markdown would be the output), because I use it for this blog and already have a bit of experience with it.
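One way to start with a single generator while leaving the door open for others is to hide the target behind a small trait. The sketch below is only an illustration of that idea: `Page`, `Renderer` and `ZolaRenderer` are hypothetical names, and the only Zola-specific assumption is that a page is a Markdown file starting with TOML front matter between `+++` lines.

```rust
/// A documentation page, independent of any website generator (hypothetical type).
pub struct Page {
    pub title: String,
    pub body_markdown: String,
}

/// Anything that can turn a `Page` into the text of a file for some generator.
pub trait Renderer {
    fn render(&self, page: &Page) -> String;
}

/// Renderer targeting Zola: TOML front matter between `+++` lines, then Markdown.
pub struct ZolaRenderer;

impl Renderer for ZolaRenderer {
    fn render(&self, page: &Page) -> String {
        // Escape backslashes and double quotes so the title remains a valid
        // TOML basic string. Control characters are not handled in this sketch.
        let title = page.title.replace('\\', "\\\\").replace('"', "\\\"");
        format!("+++\ntitle = \"{}\"\n+++\n\n{}\n", title, page.body_markdown)
    }
}
```

With this split, supporting another generator later would mean adding one more `Renderer` implementation rather than touching the code that builds pages.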
Next steps
The next steps will involve more planning. I need to learn how to use the sophia crate, and I need to prototype some visualizations and documentation so I can work backward from them.
Regarding diagrams, I am not sure yet which tool I will use. I am reasonably proficient with both Mermaid and PlantUML. But before making a choice I need to prototype and make sure I can at some point support RDF-star and reification. Documenting both has been a major pain point with the various tools I have tried before, and they are major things I need to document properly for people using my data graph.
Generating intermediary files for a static website generator might be a bad idea for several reasons: I might end up targeting only a specific version of this generator, or a specific version of a theme for it. Once I get the ball rolling with this project I might need to reconsider my options, hopefully without having to completely refactor my codebase.
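As a starting point for such a prototype, here is a rough sketch of emitting Mermaid flowchart text from plain triples. It says nothing yet about RDF-star or reification, which is exactly the open question. The `Triple` struct is a hypothetical stand-in (not a Sophia type), and the id scheme is naive: two different labels can collide on the same node id, labels containing double quotes are not escaped, and ids are not checked against Mermaid reserved words.

```rust
/// A triple reduced to three display labels (hypothetical, not a Sophia type).
pub struct Triple<'a> {
    pub subject: &'a str,
    pub predicate: &'a str,
    pub object: &'a str,
}

/// Build a node id by replacing every non-alphanumeric ASCII character with `_`.
fn node_id(label: &str) -> String {
    label
        .chars()
        .map(|c| if c.is_ascii_alphanumeric() { c } else { '_' })
        .collect()
}

/// Emit one Mermaid edge per triple, with quoted node and edge labels.
pub fn to_mermaid(triples: &[Triple<'_>]) -> String {
    let mut out = String::from("graph LR\n");
    for t in triples {
        out.push_str(&format!(
            "    {}[\"{}\"] --> |\"{}\"| {}[\"{}\"]\n",
            node_id(t.subject),
            t.subject,
            t.predicate,
            node_id(t.object),
            t.object
        ));
    }
    out
}
```

A PlantUML variant would only differ in the text emitted inside the loop, so the choice of diagram tool can stay open while prototyping.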