public interface RDFParser
This interface may be moved to the org.apache.commons.rdf.api package when it has stabilized.
This interface follows the Builder pattern, allowing callers to set parser settings such as contentType(RDFSyntax) and base(IRI). A caller MUST call one of the source methods (e.g. source(IRI), source(Path), source(InputStream)) and MUST call one of the target methods (e.g. target(Consumer), target(Dataset), target(Graph)) before calling parse() on the returned RDFParser. The methods can, however, be called in any order.
The call to parse() returns a Future, allowing asynchronous parse operations. Callers are advised to call Future.get() to ensure parsing completed successfully, or to catch exceptions thrown during parsing.
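The following is a minimal sketch of how a caller might check the Future returned by parse(). It uses only java.util.concurrent and stand-in futures, not a real parser; the class name ParseFutureCheck and the method outcome are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class ParseFutureCheck {

    /** Waits for a parse-like Future and describes how it ended. */
    static String outcome(Future<?> future) {
        try {
            future.get(30, TimeUnit.SECONDS);
            return "completed";
        } catch (ExecutionException e) {
            // The underlying failure is wrapped; getCause() exposes it.
            return "failed: " + e.getCause().getMessage();
        } catch (TimeoutException e) {
            future.cancel(true);
            return "timed out";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        // Stand-ins for the Future returned by parse(); not real parser output.
        Future<String> ok = CompletableFuture.completedFuture("done");
        CompletableFuture<String> broken = new CompletableFuture<>();
        broken.completeExceptionally(new IOException("unexpected end of file"));

        System.out.println(outcome(ok));
        System.out.println(outcome(broken));
    }
}
```

Using the timed variant of get() keeps a caller from waiting indefinitely on a stalled source; whether 30 seconds is appropriate depends on the application.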
Calling a setter that has already been called will override any existing value in the returned builder, regardless of the parameter type (e.g. source(IRI) will override a previous source(Path)). Settings can be unset by passing null. Note that this may require casting, e.g. contentType( (RDFSyntax) null ) to undo a previous call to contentType(RDFSyntax).
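To illustrate why the cast is needed, here is a small self-contained sketch with two overloaded setters. OverloadedSetter and its nested Syntax enum are hypothetical stand-ins, not Commons RDF types, and this stand-in mutates itself for brevity.

```java
class OverloadedSetter {

    /** Hypothetical stand-in for RDFSyntax. */
    enum Syntax { TURTLE, NTRIPLES }

    private String contentType; // null means "not set"

    OverloadedSetter contentType(Syntax syntax) {
        this.contentType = (syntax == null) ? null : syntax.name();
        return this;
    }

    OverloadedSetter contentType(String contentType) {
        this.contentType = contentType;
        return this;
    }

    String current() {
        return contentType;
    }

    public static void main(String[] args) {
        OverloadedSetter builder = new OverloadedSetter();
        // The later call wins, regardless of which overload was used first.
        builder.contentType("text/turtle").contentType(Syntax.NTRIPLES);
        System.out.println(builder.current());

        // builder.contentType(null) would not compile: a bare null matches both overloads.
        builder.contentType((Syntax) null); // the cast selects one overload and unsets the value
        System.out.println(builder.current());
    }
}
```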
It is undefined whether an RDFParser is mutable or thread-safe, so callers should always use the modified RDFParser returned from the builder methods. The builder may return itself after modification, or a cloned builder with the modified settings applied. Implementations are, however, encouraged to be immutable and thread-safe, and to document this. As an example starting point, see org.apache.commons.rdf.simple.AbstractRDFParser.
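The sketch below shows one way an immutable builder could behave, and why ignoring the returned instance loses the setting. ImmutableSettings is a hypothetical class written for this illustration; it is not how any particular Commons RDF implementation is known to work.

```java
class ImmutableSettings {
    private final String source; // null means "not set"
    private final String base;

    ImmutableSettings() {
        this(null, null);
    }

    private ImmutableSettings(String source, String base) {
        this.source = source;
        this.base = base;
    }

    /** Returns a copy with the source replaced; this instance is unchanged. */
    ImmutableSettings source(String newSource) {
        return new ImmutableSettings(newSource, base);
    }

    /** Returns a copy with the base replaced; this instance is unchanged. */
    ImmutableSettings base(String newBase) {
        return new ImmutableSettings(source, newBase);
    }

    String describe() {
        return "source=" + source + ", base=" + base;
    }

    public static void main(String[] args) {
        ImmutableSettings empty = new ImmutableSettings();
        empty.source("/tmp/graph.ttl"); // return value ignored, so 'empty' stays unconfigured
        ImmutableSettings configured =
                empty.source("/tmp/graph.ttl").base("http://example.com/");
        System.out.println(empty.describe());
        System.out.println(configured.describe());
    }
}
```

Code written against the returned instance works with both mutable and immutable implementations, which is why the interface recommends it.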
Example usage:

    Graph g1 = rDFTermFactory.createGraph();
    new ExampleRDFParserBuilder()
        .source(Paths.get("/tmp/graph.ttl"))
        .contentType(RDFSyntax.TURTLE)
        .target(g1)
        .parse()
        .get(30, TimeUnit.SECONDS);
Modifier and Type | Interface and Description |
---|---|
static interface | RDFParser.ParseResult The result of parse() indicating parsing completed. |
Modifier and Type | Method and Description |
---|---|
RDFParser | base(IRI base) Specify a base IRI to use for parsing any relative IRI references. |
RDFParser | base(String base) Specify a base IRI to use for parsing any relative IRI references. |
RDFParser | contentType(RDFSyntax rdfSyntax) Specify the content type of the RDF syntax to parse. |
RDFParser | contentType(String contentType) Specify the content type of the RDF syntax to parse. |
Future<? extends RDFParser.ParseResult> | parse() Parse the specified source. |
RDFParser | rdfTermFactory(RDF rdfTermFactory) Specify which RDF to use for generating RDFTerms. |
RDFParser | source(InputStream inputStream) Specify a source InputStream to parse. |
RDFParser | source(IRI iri) Specify an absolute source IRI to retrieve and parse. |
RDFParser | source(Path file) Specify a source file Path to parse. |
RDFParser | source(String iri) Specify an absolute source IRI to retrieve and parse. |
RDFParser | target(Consumer<Quad> consumer) Specify a consumer for parsed quads. |
default RDFParser | target(Dataset dataset) Specify a Dataset to add parsed quads to. |
default RDFParser | target(Graph graph) Specify a Graph to add parsed triples to. |
RDFParser rdfTermFactory(RDF rdfTermFactory)
Specify which RDF to use for generating RDFTerms.
This option may be used together with target(Graph) to override the implementation's default factory and graph.
Warning: Using the same RDF for multiple parse() calls may accidentally merge BlankNodes having the same label, as the parser may call the RDF.createBlankNode(String) method with the parsed blank node labels.
Parameters:
rdfTermFactory - RDF to use for generating RDFTerms.
Returns:
An RDFParser that will use the specified rdfTermFactory.
See Also:
target(Graph)
RDFParser contentType(RDFSyntax rdfSyntax) throws IllegalArgumentException
Specify the content type of the RDF syntax to parse.
This option can be used to select the RDFSyntax of the source, overriding any Content-Type headers or equivalent.
The character set of the RDFSyntax is assumed to be StandardCharsets.UTF_8 unless overridden within the document (e.g. <?xml version="1.0" encoding="iso-8859-1"?> in RDFSyntax.RDFXML).
This method will override any contentType set with contentType(String).
Parameters:
rdfSyntax - An RDFSyntax to parse the source according to, e.g. RDFSyntax.TURTLE.
Returns:
An RDFParser that will use the specified content type.
Throws:
IllegalArgumentException - If this RDFParser does not support the specified RDFSyntax.
See Also:
contentType(String)
RDFParser contentType(String contentType) throws IllegalArgumentException
Specify the content type of the RDF syntax to parse.
This option can be used to select the RDFSyntax of the source, overriding any Content-Type headers or equivalent.
The content type MAY include a charset parameter if the RDF media types permit it; the default charset is StandardCharsets.UTF_8 unless overridden within the document.
This method will override any contentType set with contentType(RDFSyntax).
Parameters:
contentType - A content-type string, e.g. application/ld+json or text/turtle;charset="UTF-8" as specified by RFC7231.
Returns:
An RDFParser that will use the specified content type.
Throws:
IllegalArgumentException - If the contentType has an invalid syntax, or this RDFParser does not support the specified contentType.
See Also:
contentType(RDFSyntax)
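As an illustration of the charset rule, the sketch below splits a content-type string into its media type and charset, falling back to UTF-8 when no charset parameter is present. ContentTypeCharset is a hypothetical helper, not part of Commons RDF, and the parsing is deliberately simplified: it does not handle semicolons inside quoted parameter values.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

class ContentTypeCharset {

    /** Returns the media type without parameters, lower-cased. */
    static String mediaType(String contentType) {
        return contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
    }

    /** Returns the charset parameter if present, otherwise UTF-8. */
    static Charset charset(String contentType) {
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                // Charset.forName throws an IllegalArgumentException subclass
                // for malformed or unsupported names.
                return Charset.forName(value);
            }
        }
        return StandardCharsets.UTF_8;
    }

    public static void main(String[] args) {
        System.out.println(mediaType("text/turtle;charset=\"UTF-8\""));
        System.out.println(charset("text/turtle;charset=\"UTF-8\""));
        System.out.println(charset("application/ld+json"));
    }
}
```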
default RDFParser target(Graph graph)
Specify a Graph to add parsed triples to.
If the source supports datasets (e.g. the RDFSyntax set with contentType(RDFSyntax) returns true from RDFSyntax.supportsDataset()), then only quads in the default graph will be added to the Graph as Triples.
It is undefined whether any triples are added to the specified Graph if parse() throws any exceptions. (However, implementations are free to prevent this using transaction mechanisms or similar.) If Future.get() does not indicate an exception, the parser implementation SHOULD have inserted all parsed triples into the specified graph.
Calling this method will override any earlier targets set with target(Graph), target(Consumer) or target(Dataset).
The default implementation of this method calls target(Consumer) with a Consumer that calls Graph.add(Triple) with Quad.asTriple() if the quad is in the default graph.
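The filtering behaviour described for the default implementation can be sketched with plain Java types. SimpleQuad below is a hypothetical, heavily simplified stand-in for Quad (its triple is just a String), and a List stands in for the Graph; the real default method operates on Commons RDF types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Consumer;

class DefaultGraphFilter {

    /** Hypothetical, simplified stand-in for Quad; the triple is just a String here. */
    static final class SimpleQuad {
        final Optional<String> graphName;
        final String triple;

        SimpleQuad(Optional<String> graphName, String triple) {
            this.graphName = graphName;
            this.triple = triple;
        }
    }

    /** Returns a consumer that keeps only quads from the default graph. */
    static Consumer<SimpleQuad> intoGraph(List<String> graph) {
        return quad -> {
            if (!quad.graphName.isPresent()) { // empty graph name means default graph
                graph.add(quad.triple);
            }
        };
    }

    public static void main(String[] args) {
        List<String> graph = new ArrayList<>();
        Consumer<SimpleQuad> consumer = intoGraph(graph);
        consumer.accept(new SimpleQuad(Optional.empty(), "<a> <b> <c>"));
        consumer.accept(new SimpleQuad(Optional.of("<g>"), "<d> <e> <f>"));
        System.out.println(graph); // only the default-graph triple was kept
    }
}
```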
default RDFParser target(Dataset dataset)
Specify a Dataset to add parsed quads to.
It is undefined whether any quads are added to the specified Dataset if parse() throws any exceptions. (However, implementations are free to prevent this using transaction mechanisms or similar.) On the other hand, if parse() does not indicate an exception, the implementation SHOULD have inserted all parsed quads into the specified dataset.
Calling this method will override any earlier targets set with target(Graph), target(Consumer) or target(Dataset).
The default implementation of this method calls target(Consumer) with a Consumer that calls Dataset.add(Quad).
RDFParser target(Consumer<Quad> consumer)
Specify a consumer for parsed quads.
The quads will include triples in all named graphs of the parsed source, including any triples in the default graph. When parsing a source format that does not support datasets, all quads delivered to the consumer will be in the default graph (i.e. their Quad.getGraphName() will be Optional.empty()), while for a source
It is undefined whether any quads are consumed if parse() throws any exceptions. On the other hand, if parse() does not indicate an exception, the implementation SHOULD have delivered all parsed quads to the specified consumer.
Calling this method will override any earlier targets set with target(Graph), target(Consumer) or target(Dataset).
The consumer is not assumed to be thread-safe: only one Consumer.accept(Object) call is delivered at a time for a given parse() call.
This method is typically called with a functional consumer, for example:

    List<Quad> quads = new ArrayList<>();
    parserBuilder.target(quads::add).parse();
RDFParser base(IRI base)
Specify a base IRI to use for parsing any relative IRI references.
Setting this option will override any protocol-specific base IRI (e.g. Content-Location header) or the source(IRI) IRI, but does not override any base IRIs set within the source document (e.g. @base in Turtle documents).
If the source is in a syntax that does not support relative IRI references (e.g. RDFSyntax.NTRIPLES), setting the base has no effect.
This method will override any base IRI set with base(String).
Parameters:
base - An absolute IRI to use as a base.
Returns:
An RDFParser that will use the specified base IRI.
See Also:
base(String)
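To show what resolving a relative reference against a base means in practice, here is a small sketch using java.net.URI as an approximation of IRI resolution. BaseResolution and the example.com addresses are made up for illustration; a parser may resolve references with its own IRI implementation.

```java
import java.net.URI;

class BaseResolution {

    /** Resolves a possibly relative reference against an absolute base. */
    static String resolve(String base, String reference) {
        return URI.create(base).resolve(reference).toString();
    }

    public static void main(String[] args) {
        String base = "http://example.com/data/graph.ttl";
        // A relative path replaces the last path segment of the base.
        System.out.println(resolve(base, "other.ttl"));
        // A fragment-only reference keeps the base and appends the fragment.
        System.out.println(resolve(base, "#alice"));
        // An absolute reference ignores the base entirely.
        System.out.println(resolve(base, "http://example.org/abs"));
    }
}
```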
RDFParser base(String base) throws IllegalArgumentException
Specify a base IRI to use for parsing any relative IRI references.
Setting this option will override any protocol-specific base IRI (e.g. Content-Location header) or the source(IRI) IRI, but does not override any base IRIs set within the source document (e.g. @base in Turtle documents).
If the source is in a syntax that does not support relative IRI references (e.g. RDFSyntax.NTRIPLES), setting the base has no effect.
This method will override any base IRI set with base(IRI).
Parameters:
base - An absolute IRI to use as a base.
Returns:
An RDFParser that will use the specified base IRI.
Throws:
IllegalArgumentException - If the base is not a valid absolute IRI string.
See Also:
base(IRI)
RDFParser source(InputStream inputStream)
Specify a source InputStream to parse.
The source set will not be read before the call to parse().
The InputStream will not be closed after parsing. The InputStream does not need to support mark and reset (see InputStream.markSupported()).
The parser might not consume the complete stream (e.g. an RDF/XML parser may not read beyond the closing tag of </rdf:Description>).
The contentType(RDFSyntax) or contentType(String) SHOULD be set before calling parse().
The character set is assumed to be StandardCharsets.UTF_8 unless the contentType(String) specifies otherwise or the document declares its own charset (e.g. RDF/XML with a <?xml encoding="iso-8859-1"> header).
The base(IRI) or base(String) MUST be set before calling parse(), unless the RDF syntax does not permit relative IRIs (e.g. RDFSyntax.NTRIPLES).
This method will override any source set with source(IRI), source(Path) or source(String).
Parameters:
inputStream - An InputStream to consume.
Returns:
An RDFParser that will use the specified source.
RDFParser source(Path file)
Specify a source file Path to parse.
The source set will not be read before the call to parse().
The contentType(RDFSyntax) or contentType(String) SHOULD be set before calling parse().
The character set is assumed to be StandardCharsets.UTF_8 unless the contentType(String) specifies otherwise or the document declares its own charset (e.g. RDF/XML with a <?xml encoding="iso-8859-1"> header).
The base(IRI) or base(String) MAY be set before calling parse(); otherwise Path.toUri() will be used as the base IRI.
This method will override any source set with source(IRI), source(InputStream) or source(String).
Parameters:
file - A Path for a file to parse.
Returns:
An RDFParser that will use the specified source.
RDFParser source(IRI iri)
Specify an absolute source IRI to retrieve and parse.
The source set will not be read before the call to parse().
If this builder does not support the given IRI protocol (e.g. urn:uuid:ce667463-c5ab-4c23-9b64-701d055c4890), this method should succeed, while parse() should throw an IOException.
The contentType(RDFSyntax) or contentType(String) MAY be set before calling parse(), in which case that type MAY be used for content negotiation (e.g. Accept header in HTTP), and SHOULD be used for selecting the RDFSyntax.
The character set is assumed to be StandardCharsets.UTF_8 unless the protocol's equivalent of Content-Type specifies otherwise or the document declares its own charset (e.g. RDF/XML with a <?xml encoding="iso-8859-1"> header).
The base(IRI) or base(String) MAY be set before calling parse(); otherwise the source IRI will be used as the base IRI.
This method will override any source set with source(Path), source(InputStream) or source(String).
Parameters:
iri - An IRI to retrieve and parse.
Returns:
An RDFParser that will use the specified source.
RDFParser source(String iri) throws IllegalArgumentException
Specify an absolute source IRI to retrieve and parse.
The source set will not be read before the call to parse().
If this builder does not support the given IRI (e.g. urn:uuid:ce667463-c5ab-4c23-9b64-701d055c4890), this method should succeed, while parse() should throw an IOException.
The contentType(RDFSyntax) or contentType(String) MAY be set before calling parse(), in which case that type MAY be used for content negotiation (e.g. Accept header in HTTP), and SHOULD be used for selecting the RDFSyntax.
The character set is assumed to be StandardCharsets.UTF_8 unless the protocol's equivalent of Content-Type specifies otherwise or the document declares its own charset (e.g. RDF/XML with a <?xml encoding="iso-8859-1"> header).
The base(IRI) or base(String) MAY be set before calling parse(); otherwise the source IRI will be used as the base IRI.
This method will override any source set with source(Path), source(InputStream) or source(IRI).
Parameters:
iri - An IRI to retrieve and parse.
Returns:
An RDFParser that will use the specified source.
Throws:
IllegalArgumentException - If the IRI is not a valid absolute IRI string.
Future<? extends RDFParser.ParseResult> parse() throws IOException, IllegalStateException
Parse the specified source.
A source method (e.g. source(InputStream), source(IRI), source(Path), source(String) or an equivalent subclass method) MUST have been called before calling this method, otherwise an IllegalStateException will be thrown.
A target method (e.g. target(Consumer), target(Dataset), target(Graph) or an equivalent subclass method) MUST have been called before calling parse(), otherwise an IllegalStateException will be thrown.
It is undefined whether this method is thread-safe; however, the RDFParser may be reused (e.g. by setting a different source) as soon as the Future has been returned from this method.
The RDFParser SHOULD perform the parsing as an asynchronous operation, and return the Future as soon as preliminary checks (such as validity of the source(IRI) and contentType(RDFSyntax) settings) have finished. The future SHOULD NOT report Future.isDone() before parsing is complete. A synchronous implementation MAY block on the parse() call and return a Future for which Future.isDone() is already true.
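The two permitted styles can be sketched with CompletableFuture. ParseModes and its methods are hypothetical names, and a Supplier stands in for the actual parsing work; this is an illustration of the contract, not the approach of any specific implementation.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;
import java.util.function.Supplier;

class ParseModes {

    /** Preliminary check before any work starts: fail fast if no source was set. */
    static void checkSource(Object source) {
        if (source == null) {
            throw new IllegalStateException("No source has been set");
        }
    }

    /** Asynchronous style: run the work on another thread and return at once. */
    static Future<String> parseAsync(Supplier<String> work) {
        return CompletableFuture.supplyAsync(work);
    }

    /** Synchronous style: block in the call, then return an already-done Future. */
    static Future<String> parseBlocking(Supplier<String> work) {
        return CompletableFuture.completedFuture(work.get());
    }

    public static void main(String[] args) throws Exception {
        checkSource("/tmp/graph.ttl");
        Future<String> async = parseAsync(() -> "parsed asynchronously");
        Future<String> blocking = parseBlocking(() -> "parsed synchronously");
        System.out.println(blocking.isDone()); // true: the work finished inside the call
        System.out.println(async.get());
        System.out.println(blocking.get());
    }
}
```

In both styles the caller sees the same Future-based contract, so calling code does not need to know which one an implementation chose.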
The returned Future contains an RDFParser.ParseResult. Implementations may subclass this interface to provide any parser details, e.g. a list of warnings. null is a possible return value if no details are available but parsing succeeded.
If an exception occurs during parsing (e.g. IOException or org.apache.commons.rdf.simple.experimental.RDFParseException), it should be indicated as the Throwable.getCause() of the ExecutionException thrown on Future.get().
Returns:
A Future that will return the populated Graph when the parsing has finished.
Throws:
IOException - If an error occurred while starting to read the source (e.g. file not found, unsupported IRI protocol). Note that IO errors during parsing would instead be the Throwable.getCause() of the ExecutionException thrown on Future.get().
IllegalStateException - If the builder is in an invalid state, e.g. a source has not been set.
Copyright © 2015–2018 The Apache Software Foundation. All rights reserved.