Commit

Feature: Add URI fetching related triples and serialization in different formats (#125)

* Add raptor library to parse ntriples data

* Add resource model to fetch id related triples and serialize it

* Add and enhance xml, ntriples, turtle and json serializers

* Updating rdf version in goo project

* updating resource model

* Adding tests for resource model and serializers

* update the resource test to have a more complete data to test (array, bnodes, typed values)

* re-implement xml serializer using RDF/XML parser instead of Raptor

* implement array handling of resource to_object

* Enhance and refactor serializers ntriples, turtle and xml

* Enhance and refactor serializers ntriples, turtle and xml

* Handle blank nodes and reverse triples
- handle blank nodes
- fetch reverse triples
- generate random name for models in to_object, because when two models are created at the same time, one overrides the other
- call the new serializer JSONLD and RDF_XML

* Implement new serializers jsonld and rdf_xml

- implement jsonld serializer that uses json-ld library
- revert changes in xml.rb file to the original implementation, and put the new implementation in rdf_xml.rb file
- Add the media types :jsonld and :rdf_xml

* Add json-ld gem

* Enhance the test resource

- Add some cases to the data tests
- refactor the test of the serializers formats

* Fix test for fetch-related triples and json

* clean and refactor the resource serializer code

* Removed unused methods
* Extracted duplicated code in methods
* Removed skip from the tests

---------

Co-authored-by: Syphax bouazzouni <[email protected]>
imadbourouche and syphax-bouazzouni authored Mar 8, 2024
1 parent d37aeaf commit c9738fa
Showing 11 changed files with 655 additions and 6 deletions.
1 change: 1 addition & 0 deletions Dockerfile
@@ -7,6 +7,7 @@ RUN apt-get update -yqq && apt-get install -yqq --no-install-recommends \
     openjdk-11-jre-headless \
     raptor2-utils \
     wait-for-it \
+    libraptor2-dev \
     && rm -rf /var/lib/apt/lists/*

 RUN mkdir -p /srv/ontoportal/ontologies_linked_data
2 changes: 2 additions & 0 deletions Gemfile
@@ -21,8 +21,10 @@ gem 'rubyzip', '~> 1.0'
 gem 'thin'
 gem 'request_store'
 gem 'jwt'
+gem 'json-ld', '~> 3.0.2'
 gem "parallel", "~> 1.24"

+
 # Testing
 group :test do
   gem 'email_spec'
4 changes: 4 additions & 0 deletions Gemfile.lock
@@ -92,6 +95,9 @@ GEM
     i18n (0.9.5)
       concurrent-ruby (~> 1.0)
     json (2.7.1)
+    json-ld (3.0.2)
+      multi_json (~> 1.12)
+      rdf (>= 2.2.8, < 4.0)
     json_pure (2.7.1)
     jwt (2.8.1)
       base64
@@ -239,6 +242,7 @@ DEPENDENCIES
   faraday (~> 1.9)
   ffi
   goo!
+  json-ld (~> 3.0.2)
   jwt
   libxml-ruby (~> 2.0)
   minitest
3 changes: 3 additions & 0 deletions lib/ontologies_linked_data/media_types.rb
@@ -3,8 +3,11 @@ module MediaTypes
     HTML = :html
     JSON = :json
     JSONP = :jsonp
+    JSONLD = :jsonld
     XML = :xml
+    RDF_XML = :rdf_xml
     TURTLE = :turtle
+    NTRIPLES = :ntriples
     DEFAULT = JSON

     def self.all
187 changes: 187 additions & 0 deletions lib/ontologies_linked_data/models/resource.rb
@@ -0,0 +1,187 @@
require 'rdf/raptor'

module LinkedData
  module Models

    class Resource

      def initialize(graph, id)
        @id = id
        @graph = graph
        @hash = fetch_related_triples(graph, id)
      end

      def to_hash
        @hash.dup
      end

      def to_object
        hashes = self.to_hash
        class_name = "GeneratedModel_#{Time.now.to_i}_#{rand(10000..99999)}"
        model_schema = ::Class.new(LinkedData::Models::Base)
        Object.const_set(class_name, model_schema)

        model_schema.model(:resource, name_with: :id, rdf_type: lambda { |*_x| self.to_hash[Goo.namespaces[:rdf][:type].to_s] })
        values_hash = {}
        hashes.each do |predicate, value|
          namespace, attr = namespace_predicate(predicate)
          next if namespace.nil?

          values = Array(value).map do |v|
            if v.is_a?(Hash)
              Struct.new(*v.keys.map { |k| namespace_predicate(k)[1].to_sym }.compact).new(*v.values)
            else
              v.is_a?(RDF::URI) ? v.to_s : v.object
            end
          end.compact

          model_schema.attribute(attr.to_sym, property: namespace.to_s, enforce: get_type(value))
          values_hash[attr.to_sym] = value.is_a?(Array) ? values : values.first
        end

        values_hash[:id] = hashes['id']
        model_schema.new(values_hash)
      end

      def to_json
        LinkedData::Serializers.serialize(to_hash, LinkedData::MediaTypes::JSONLD, namespaces)
      end

      def to_xml
        LinkedData::Serializers.serialize(to_hash, LinkedData::MediaTypes::RDF_XML, namespaces)
      end

      def to_ntriples
        LinkedData::Serializers.serialize(to_hash, LinkedData::MediaTypes::NTRIPLES, namespaces)
      end

      def to_turtle
        LinkedData::Serializers.serialize(to_hash, LinkedData::MediaTypes::TURTLE, namespaces)
      end

      def namespaces
        prefixes = {}
        ns_count = 0
        hash = to_hash
        reverse = hash.delete('reverse')

        hash.each do |key, value|
          uris = [key]
          uris += Array(value).map { |v| v.is_a?(Hash) ? v.to_a.flatten : v }.flatten
          prefixes, ns_count = transform_to_prefixes(ns_count, prefixes, uris)
        end

        reverse.each { |key, uris| prefixes, ns_count = transform_to_prefixes(ns_count, prefixes, [key] + Array(uris)) }

        prefixes
      end

      private

      def transform_to_prefixes(ns_count, prefixes, uris)
        uris.each do |uri|
          namespace, id = namespace_predicate(uri)
          next if namespace.nil? || prefixes.value?(namespace)

          prefix, prefix_namespace = Goo.namespaces.select { |_k, v| v.to_s.eql?(namespace) }.first
          if prefix
            prefixes[prefix] = prefix_namespace.to_s
          else
            prefixes["ns#{ns_count}".to_sym] = namespace
            ns_count += 1
          end
        end
        [prefixes, ns_count]
      end

      def fetch_related_triples(graph, id)
        direct_fetch_query = Goo.sparql_query_client.select(:predicate, :object)
                                .from(RDF::URI.new(graph))
                                .where([RDF::URI.new(id), :predicate, :object])

        inverse_fetch_query = Goo.sparql_query_client.select(:subject, :predicate)
                                 .from(RDF::URI.new(graph))
                                 .where([:subject, :predicate, RDF::URI.new(id)])

        hashes = { 'id' => RDF::URI.new(id) }

        direct_fetch_query.each_solution do |solution|
          predicate = solution[:predicate].to_s
          value = solution[:object]

          if value.is_a?(RDF::Node) && Array(hashes[predicate]).none? { |x| x.is_a?(Hash) }
            value = fetch_b_nodes_triples(graph, id, solution[:predicate])
          elsif value.is_a?(RDF::Node)
            next
          end

          hashes[predicate] = hashes[predicate] ? (Array(hashes[predicate]) + Array(value)) : value
        end

        hashes['reverse'] = {}
        inverse_fetch_query.each_solution do |solution|
          subject = solution[:subject].to_s
          predicate = solution[:predicate]

          if hashes['reverse'][subject]
            if hashes['reverse'][subject].is_a?(Array)
              hashes['reverse'][subject] << predicate
            else
              hashes['reverse'][subject] = [predicate, hashes['reverse'][subject]]
            end
          else
            hashes['reverse'][subject] = predicate
          end

        end

        hashes
      end

      def fetch_b_nodes_triples(graph, id, predicate)
        b_node_fetch_query = Goo.sparql_query_client.select(:b, :predicate, :object)
                                .from(RDF::URI.new(graph))
                                .where(
                                  [RDF::URI.new(id), predicate, :b],
                                  %i[b predicate object]
                                )

        b_nodes_hash = {}
        b_node_fetch_query.each_solution do |s|
          b_node_id = s[:b].to_s
          s[:predicate].to_s
          s[:object]
          if b_nodes_hash[b_node_id]
            b_nodes_hash[b_node_id][s[:predicate].to_s] = s[:object]
          else
            b_nodes_hash[b_node_id] = { s[:predicate].to_s => s[:object] }
          end
        end
        b_nodes_hash.values
      end

      def get_type(value)
        types = []
        types << :list if value.is_a?(Array)
        value = Array(value).first
        if value.is_a?(RDF::URI)
          types << :uri
        elsif value.is_a?(Float)
          types << :float
        elsif value.is_a?(Integer)
          types << :integer
        elsif value.to_s.eql?('true') || value.to_s.eql?('false')
          types << :boolean
        end
        types
      end

      def namespace_predicate(property_url)
        regex = /^(?<namespace>.*[\/#])(?<id>[^\/#]+)$/
        match = regex.match(property_url.to_s)
        [match[:namespace], match[:id]] if match
      end

    end
  end
end
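
A minimal sketch of how `namespace_predicate` and `transform_to_prefixes` appear to cooperate in the file above, written in plain Ruby so it runs without Goo or RDF.rb. `KNOWN_NAMESPACES` is a hypothetical stand-in for `Goo.namespaces`, and `split_uri` and `build_prefixes` are illustrative names that do not exist in the commit.

```ruby
# Hypothetical stand-in for Goo.namespaces (assumed to map prefix => namespace).
KNOWN_NAMESPACES = {
  rdf: 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
  skos: 'http://www.w3.org/2004/02/skos/core#'
}.freeze

# Same pattern as namespace_predicate: everything up to the last '/' or '#'
# is the namespace, the remainder is the local id.
NAMESPACE_REGEX = %r{^(?<namespace>.*[/#])(?<id>[^/#]+)$}

def split_uri(uri)
  match = NAMESPACE_REGEX.match(uri.to_s)
  [match[:namespace], match[:id]] if match
end

# Assigns a known prefix when the namespace is registered,
# otherwise a generated one (ns0, ns1, ...).
def build_prefixes(uris)
  prefixes = {}
  ns_count = 0
  uris.each do |uri|
    namespace, _id = split_uri(uri)
    next if namespace.nil? || prefixes.value?(namespace)

    known_prefix, = KNOWN_NAMESPACES.find { |_prefix, ns| ns == namespace }
    if known_prefix
      prefixes[known_prefix] = namespace
    else
      prefixes[:"ns#{ns_count}"] = namespace
      ns_count += 1
    end
  end
  prefixes
end
```

Values without a `/` or `#` (plain literals, for instance) produce no match and are skipped, which is why the `nil` check comes before the prefix lookup.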
40 changes: 40 additions & 0 deletions lib/ontologies_linked_data/serializers/jsonld.rb
@@ -0,0 +1,40 @@
require 'multi_json'
require 'json/ld'

module LinkedData
  module Serializers
    class JSONLD

      def self.serialize(hashes, options = {})
        subject = RDF::URI.new(hashes['id'])
        reverse = hashes['reverse'] || {}
        hashes.delete('id')
        hashes.delete('reverse')
        graph = RDF::Graph.new

        hashes.each do |property_url, val|
          Array(val).each do |v|
            if v.is_a?(Hash)
              blank_node = RDF::Node.new
              v.each do |blank_predicate, blank_value|
                graph << RDF::Statement.new(blank_node, RDF::URI.new(blank_predicate), blank_value)
              end
              v = blank_node
            end
            graph << RDF::Statement.new(subject, RDF::URI.new(property_url), v)
          end
        end

        reverse.each do |reverse_subject, reverse_property|
          Array(reverse_property).each do |s|
            graph << RDF::Statement.new(RDF::URI.new(reverse_subject), RDF::URI.new(s), subject)
          end
        end

        context = { '@context' => options.transform_keys(&:to_s) }
        compacted = ::JSON::LD::API.compact(::JSON::LD::API.fromRdf(graph), context['@context'])
        MultiJson.dump(compacted)
      end
    end
  end
end
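
To illustrate why the serializer above passes the prefix map as a context, here is a hand-rolled approximation of what a prefix-only `@context` does to property IRIs. It is not the JSON-LD compaction algorithm and does not call the json-ld gem: it only rewrites property keys, while the real algorithm also handles values, `@type`, nested nodes, and more, so the library's actual output may differ in details. `compact_iri` and `to_compacted_json` are hypothetical names.

```ruby
require 'json'

# Hypothetical helper: shortens an IRI to prefix:term when a prefix matches.
def compact_iri(iri, prefixes)
  prefix, namespace = prefixes.find { |_prefix, ns| iri.start_with?(ns) }
  prefix ? "#{prefix}:#{iri.delete_prefix(namespace)}" : iri
end

# Builds a JSON document whose keys are shortened with the given prefixes.
# Only direct predicates are handled; 'id' and 'reverse' are set aside,
# mirroring how the serializer removes them from the hash first.
def to_compacted_json(hashes, prefixes)
  doc = {
    '@context' => prefixes.transform_keys(&:to_s),
    '@id' => hashes['id'].to_s
  }
  hashes.each do |predicate, value|
    next if %w[id reverse].include?(predicate)

    doc[compact_iri(predicate, prefixes)] = value
  end
  JSON.generate(doc)
end
```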
37 changes: 37 additions & 0 deletions lib/ontologies_linked_data/serializers/ntriples.rb
@@ -0,0 +1,37 @@
module LinkedData
  module Serializers
    class NTRIPLES

      def self.serialize(hashes, options = {})
        subject = RDF::URI.new(hashes['id'])
        reverse = hashes['reverse'] || {}
        hashes.delete('id')
        hashes.delete('reverse')
        RDF::Writer.for(:ntriples).buffer(prefixes: options) do |writer|
          hashes.each do |p, o|
            predicate = RDF::URI.new(p)
            Array(o).each do |item|
              if item.is_a?(Hash)
                blank_node = RDF::Node.new
                item.each do |blank_predicate, blank_value|
                  writer << RDF::Statement.new(blank_node, RDF::URI.new(blank_predicate), blank_value)
                end
                item = blank_node
              end
              writer << RDF::Statement.new(subject, predicate, item)
            end
          end

          reverse.each do |reverse_subject, reverse_property|
            Array(reverse_property).each do |s|
              writer << RDF::Statement.new(RDF::URI.new(reverse_subject), RDF::URI.new(s), subject)
            end
          end
        end
      end

    end
  end
end
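
The JSON-LD, N-Triples, and RDF/XML serializers in this commit share one step: flattening the resource hash (direct predicates, nested hashes for blank nodes, and the `reverse` map) into statements. The following dependency-free sketch shows that step with naive N-Triples output. `Iri`, `escape_literal`, `nt_term`, and `to_ntriples_sketch` are hypothetical names; the commit itself relies on `RDF::Statement` and `RDF::Writer` from RDF.rb, and this sketch ignores datatypes and language tags and escapes only the most common characters.

```ruby
# Wrapper that marks a string as an IRI rather than a literal (assumption:
# the real code uses RDF::URI for this distinction).
Iri = Struct.new(:value)

def escape_literal(text)
  text.to_s
      .gsub('\\') { '\\\\' }
      .gsub('"') { '\\"' }
      .gsub("\n") { '\\n' }
      .gsub("\r") { '\\r' }
end

# Symbols stand in for blank node labels in this sketch.
def nt_term(term)
  case term
  when Iri then "<#{term.value}>"
  when Symbol then "_:#{term}"
  else "\"#{escape_literal(term)}\""
  end
end

def to_ntriples_sketch(hashes)
  subject = Iri.new(hashes['id'])
  statements = []
  bnode_count = 0

  hashes.each do |predicate, objects|
    next if %w[id reverse].include?(predicate)

    (objects.is_a?(Array) ? objects : [objects]).each do |object|
      if object.is_a?(Hash)
        # A nested hash becomes a blank node with its own statements.
        bnode = :"b#{bnode_count}"
        bnode_count += 1
        object.each { |pred, obj| statements << [bnode, Iri.new(pred), obj] }
        object = bnode
      end
      statements << [subject, Iri.new(predicate), object]
    end
  end

  # Reverse entries point at the resource, so it becomes the object.
  (hashes['reverse'] || {}).each do |reverse_subject, predicates|
    Array(predicates).each do |pred|
      statements << [Iri.new(reverse_subject), Iri.new(pred), subject]
    end
  end

  statements.map { |s, pr, o| "#{nt_term(s)} #{nt_term(pr)} #{nt_term(o)} ." }.join("\n")
end
```

The sketch wraps single values with an explicit `is_a?(Array)` check instead of `Array(...)`, because `Array()` would unpack a `Struct` such as `Iri` into its members.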


43 changes: 43 additions & 0 deletions lib/ontologies_linked_data/serializers/rdf_xml.rb
@@ -0,0 +1,43 @@
module LinkedData
  module Serializers
    class RDF_XML
      def self.serialize(hashes, options = {})
        subject = RDF::URI.new(hashes["id"])
        reverse = hashes["reverse"] || {}
        hashes.delete("id")
        hashes.delete("reverse")
        graph = RDF::Graph.new

        hashes.each do |property_url, val|
          Array(val).each do |v|
            if v.is_a?(Hash)
              blank_node = RDF::Node.new
              v.each do |blank_predicate, blank_value|
                graph << RDF::Statement.new(blank_node, RDF::URI.new(blank_predicate), blank_value)
              end
              v = blank_node
            end
            graph << RDF::Statement.new(subject, RDF::URI.new(property_url), v)
          end
        end

        inverse_graph = RDF::Graph.new
        reverse.each do |reverse_subject, reverse_property|
          Array(reverse_property).each do |s|
            inverse_graph << RDF::Statement.new(RDF::URI.new(reverse_subject), RDF::URI.new(s), subject)
          end
        end

        a = RDF::RDFXML::Writer.buffer(prefixes: options) do |writer|
          writer << graph
        end

        b = RDF::RDFXML::Writer.buffer(prefixes: options) do |writer|
          writer << inverse_graph
        end
        xml_result = "#{a.chomp("</rdf:RDF>\n")}\n#{b.sub!(/^<\?xml[^>]*>\n<rdf:RDF[^>]*>/, '').gsub(/^$\n/, '')}"
        xml_result.gsub(/^$\n/, '')
      end
    end
  end
end
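
The last two statements of `serialize` splice two RDF/XML documents into one by dropping the closing tag of the first and the XML declaration plus opening `rdf:RDF` tag of the second. A standalone sketch of that string surgery follows, using plain strings instead of writer output. `merge_rdf_xml` is a hypothetical name. One deliberate difference: `String#sub!` returns `nil` when nothing is replaced, so the sketch uses the non-mutating `sub` to stay safe if the second document does not start with the expected header.

```ruby
# Joins two RDF/XML documents that each have their own <rdf:RDF> wrapper.
def merge_rdf_xml(first_doc, second_doc)
  # Drop the closing tag of the first document.
  head = first_doc.chomp("</rdf:RDF>\n")
  # Drop the XML declaration and opening <rdf:RDF ...> tag of the second.
  tail = second_doc.sub(/^<\?xml[^>]*>\n<rdf:RDF[^>]*>/, '')
  # Concatenate and remove the empty lines left behind by the cuts.
  "#{head}\n#{tail}".gsub(/^$\n/, '')
end

DIRECT_DOC = <<~XML
  <?xml version='1.0' encoding='utf-8' ?>
  <rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
    <rdf:Description rdf:about='http://example.org/a'/>
  </rdf:RDF>
XML

INVERSE_DOC = <<~XML
  <?xml version='1.0' encoding='utf-8' ?>
  <rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
    <rdf:Description rdf:about='http://example.org/b'/>
  </rdf:RDF>
XML

puts merge_rdf_xml(DIRECT_DOC, INVERSE_DOC)
```

This approach assumes both documents declare the same namespaces on their root element, which holds in the commit because both writers receive the same `prefixes` option.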
14 changes: 8 additions & 6 deletions lib/ontologies_linked_data/serializers/serializers.rb
@@ -1,26 +1,28 @@
 require 'ontologies_linked_data/media_types'
 require 'ontologies_linked_data/serializers/xml'
+require 'ontologies_linked_data/serializers/rdf_xml'
 require 'ontologies_linked_data/serializers/json'
 require 'ontologies_linked_data/serializers/jsonp'
+require 'ontologies_linked_data/serializers/jsonld'
 require 'ontologies_linked_data/serializers/html'
+require 'ontologies_linked_data/serializers/ntriples'
+require 'ontologies_linked_data/serializers/turtle'

 module LinkedData
   module Serializers
     def self.serialize(obj, type, options = {})
       SERIALIZERS[type].serialize(obj, options)
     end

-    class Turtle
-      def self.serialize(obj, options)
-      end
-    end
-
     SERIALIZERS = {
       LinkedData::MediaTypes::HTML => HTML,
       LinkedData::MediaTypes::JSON => JSON,
       LinkedData::MediaTypes::JSONP => JSONP,
+      LinkedData::MediaTypes::JSONLD => JSONLD,
       LinkedData::MediaTypes::XML => XML,
-      LinkedData::MediaTypes::TURTLE => JSON
+      LinkedData::MediaTypes::RDF_XML => RDF_XML,
+      LinkedData::MediaTypes::TURTLE => TURTLE,
+      LinkedData::MediaTypes::NTRIPLES => NTRIPLES
     }
   end
 end
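
The `SERIALIZERS` hash is a small registry that maps a media type symbol to a class responding to `.serialize(obj, options)`. A self-contained sketch of the same dispatch pattern is shown here with toy serializers. The module and class names are hypothetical. Unlike the code in the commit, where `SERIALIZERS[type]` returns `nil` for an unregistered type and the subsequent call fails with `NoMethodError`, this sketch uses `Hash#fetch` to raise a descriptive error; that is a design suggestion rather than what the commit does.

```ruby
module SketchSerializers
  # Toy serializer: upper-cases the string form of the object.
  class Upcase
    def self.serialize(obj, _options = {})
      obj.to_s.upcase
    end
  end

  # Toy serializer: wraps the string form with an optional label.
  class Labeled
    def self.serialize(obj, options = {})
      "#{options.fetch(:label, 'value')}=#{obj}"
    end
  end

  REGISTRY = {
    upcase: Upcase,
    labeled: Labeled
  }.freeze

  # Looks up the serializer for the given type and delegates to it.
  def self.serialize(obj, type, options = {})
    serializer = REGISTRY.fetch(type) do
      raise ArgumentError, "no serializer registered for #{type.inspect}"
    end
    serializer.serialize(obj, options)
  end
end
```

Usage: `SketchSerializers.serialize('abc', :upcase)` delegates to `Upcase`, while an unknown type such as `:yaml` raises `ArgumentError` instead of failing on `nil`.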