Merge pull request #4 from jlareck/images-documentation
Add documentation to the images dataset
JJ-Author authored Feb 14, 2022
2 parents 9ae7bf4 + 3fdd1ad commit a1f234d
Showing 2 changed files with 52 additions and 0 deletions.
33 changes: 33 additions & 0 deletions databus-poms/dbpedia/generic/images/images.md
@@ -0,0 +1,33 @@
# Images dataset
Contains URLs and basic metadata about images included in Wikipedia articles.

The dataset uses several properties to relate image entities to DBpedia entities.
`dbo:thumbnail` links a DBpedia entity to the thumbnail of the first image on its wiki page (the download URL with the `width` parameter set to 300 pixels).
In contrast, `foaf:depiction` links a DBpedia entity to the full-size URLs of all images on the page; the order of appearance is not recorded.

`foaf:thumbnail`, in turn, links an image entity to its thumbnail URL. The subject of an image entity is the download / export API link, and basic metadata is attached to it with
`rdf:type` and `dc:rights`.


Examples of triples:

```
# DBpedia entity
<http://dbpedia.org/resource/Joe_Biden> <http://xmlns.com/foaf/0.1/depiction> <http://commons.wikimedia.org/wiki/Special:FilePath/Joe_Biden_presidential_portrait.jpg> .
<http://dbpedia.org/resource/Joe_Biden> <http://dbpedia.org/ontology/thumbnail> <http://commons.wikimedia.org/wiki/Special:FilePath/Joe_Biden_presidential_portrait.jpg?width=300> .
# Image entities
<http://commons.wikimedia.org/wiki/Special:FilePath/Joe_Biden_presidential_portrait.jpg> <http://xmlns.com/foaf/0.1/thumbnail> <http://commons.wikimedia.org/wiki/Special:FilePath/Joe_Biden_presidential_portrait.jpg?width=300> .
<http://commons.wikimedia.org/wiki/Special:FilePath/Joe_Biden_presidential_portrait.jpg> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Image> .
<http://commons.wikimedia.org/wiki/Special:FilePath/Joe_Biden_presidential_portrait.jpg> <http://purl.org/dc/elements/1.1/rights> <http://en.wikipedia.org/wiki/File:Joe_Biden_presidential_portrait.jpg> .
# Thumbnail entities
<http://commons.wikimedia.org/wiki/Special:FilePath/Joe_Biden_presidential_portrait.jpg?width=300> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Image> .
<http://commons.wikimedia.org/wiki/Special:FilePath/Joe_Biden_presidential_portrait.jpg?width=300> <http://purl.org/dc/elements/1.1/rights> <http://en.wikipedia.org/wiki/File:Joe_Biden_presidential_portrait.jpg> .
```
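
The pattern in the example above can be reproduced with a short script. The following is a minimal Python sketch, not the extractor's actual implementation (which lives in the Scala extraction framework). The function names `triple` and `image_triples` and the parameters `rights_page` and `is_first` are hypothetical, and the sketch assumes the file name is already URL-safe.

```python
# Illustrative sketch only: builds the N-Triples lines shown in the example.
FOAF = "http://xmlns.com/foaf/0.1/"
DBO = "http://dbpedia.org/ontology/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DC_RIGHTS = "http://purl.org/dc/elements/1.1/rights"
FILE_PATH = "http://commons.wikimedia.org/wiki/Special:FilePath/"


def triple(s, p, o):
    """Format one N-Triples line whose subject, predicate and object are IRIs."""
    return f"<{s}> <{p}> <{o}> ."


def image_triples(entity, file_name, rights_page, is_first=True, width=300):
    """Return the triples for one image on the wiki page of `entity`.

    `is_first` (hypothetical flag) marks the first image on the page, which
    is the only one that receives a dbo:thumbnail link from the entity.
    """
    image = FILE_PATH + file_name        # full-size download URL
    thumb = f"{image}?width={width}"     # thumbnail URL with width parameter

    lines = [triple(entity, FOAF + "depiction", image)]
    if is_first:
        lines.append(triple(entity, DBO + "thumbnail", thumb))
    lines.append(triple(image, FOAF + "thumbnail", thumb))

    # Basic metadata is attached to both the image and its thumbnail.
    for node in (image, thumb):
        lines.append(triple(node, RDF_TYPE, DBO + "Image"))
        lines.append(triple(node, DC_RIGHTS, rights_page))
    return lines


if __name__ == "__main__":
    for line in image_triples(
        "http://dbpedia.org/resource/Joe_Biden",
        "Joe_Biden_presidential_portrait.jpg",
        "http://en.wikipedia.org/wiki/File:Joe_Biden_presidential_portrait.jpg",
    ):
        print(line)
```

Called with the values from the example, the sketch yields the same seven triples in the same order; passing `is_first=False` omits the `dbo:thumbnail` triple, reflecting that only the first image on a page is used as the entity's thumbnail.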


### Changelog

* [2021.12.01] Fixed the generation of image triples for pages that contain no images, and fixed the extraction of images that are unrelated to the wiki page (Commits: [5b984fc](https://github.com/dbpedia/extraction-framework/commit/5b984fc7d9f61822a9b017b85f6b2546c15e1370), [ebc6c61](https://github.com/dbpedia/extraction-framework/commit/ebc6c6184679cfdea0c1648bae593a580dd8d7cc)). Related issue: https://github.com/dbpedia/extraction-framework/issues/720
19 changes: 19 additions & 0 deletions databus-poms/dbpedia/generic/images/pom.xml
@@ -0,0 +1,19 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<parent>
<groupId>generic</groupId>
<artifactId>group-metadata</artifactId>
<version>2016.10.01</version>
</parent>
<modelVersion>4.0.0</modelVersion>
<groupId>generic</groupId>
<artifactId>images</artifactId>
<packaging>jar</packaging>

<properties>
<databus.codeReference>https://github.com/dbpedia/extraction-framework/blob/master/core/src/main/scala/org/dbpedia/extraction/mappings/ImageExtractorNew.scala</databus.codeReference>
</properties>

</project>
