DataBus Archives - DBpedia Association
https://www.dbpedia.org/blog/tag/databus/
Global and Unified Access to Knowledge Graphs
Mon, 12 Feb 2024 16:50:39 +0000

A year with DBpedia – Retrospective Part 2/2023
https://www.dbpedia.org/blog/a-year-with-dbpedia-retrospective-part-2-2023/
Thu, 04 Jan 2024 13:45:24 +0000

The post A year with DBpedia – Retrospective Part 2/2023 appeared first on DBpedia Association.

This is the final part of our journey through 2023. In the previous blog post we presented the DBpedia highlights. Now we will take a look at the second half of 2023 and give an outlook for 2024.

Tutorial @ Language, Data and Knowledge conference

On 13th of September, 2023, an exciting tutorial took place at the Center for Translation Studies of the University of Vienna as part of LDK 2023. The LDK conference focuses on the acquisition, maintenance and use of language data in the context of data science and knowledge-based applications. The tutorial was opened by Milan Dojchinovski (InfAI, DBpedia Association, CTU in Prague). Three sessions followed, covering the DBpedia Knowledge Graph, the infrastructure and the use of the Databus data publishing platform, accompanied by many real-world practical use cases. Check more details on our events page.

DBpedia Day @ SEMANTiCS in Leipzig 

DBpedia Day was once again part of the program at the SEMANTiCS 2023 conference. It was held on 20th of September at the HYPERION Hotel Leipzig with up to 100 DBpedians. As in previous years, our CEO Sebastian Hellmann opened the day, this time with a presentation of the “DBpedia Databus version 2.1.0”. This was followed by the exciting keynote speech “Towards Foundation Models for Data Spaces” by Edward Curry from the University of Galway, Ireland. Afterwards, we organized the member session and the DBpedia Science Talk session. All slides can also be found on our events page.

Databus

Databus pre-launch announcement

We are in the final stage of the DBpedia Databus open software release (GitHub). Remaining issues include quality of life and UI improvements. Check out the Databus feature matrix for our lightweight, scalable, adaptable, powerful Data Catalog Platform (direct download link, persistent data identifier on the databus). Contact dbpedia@infai.org for demo, business, or research proposal inquiries.

Databus excels at cataloging decentralized data of any file type using RDF/DCAT. We selected a few initial focal use cases, where the Databus serves as:

  1. AIModelHub for AI training data, models, validation, and deployment.
  2. Research Data Management Catalog for research institutes and communities.
  3. Supply-Chain-Management Platform for product information collection along the supply chain and construction of Digital Product Passports.
  4. Community Data Portal, e.g., for the DBpedia Community.
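
To make the RDF/DCAT cataloging idea more concrete, here is a minimal sketch of a DCAT description for a single file, serialized as JSON-LD. It uses only standard DCAT and Dublin Core terms. The `dcat_record` helper and all `example.org` identifiers are hypothetical, and the metadata model actually used by the Databus may well differ.

```python
import json


def dcat_record(dataset_iri, title, file_url, media_type):
    """Build a minimal DCAT dataset description as a JSON-LD dict."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@id": dataset_iri,
        "@type": "dcat:Dataset",
        "dct:title": title,
        # One distribution = one downloadable file of any type.
        "dcat:distribution": {
            "@id": dataset_iri + "#file",
            "@type": "dcat:Distribution",
            "dcat:downloadURL": {"@id": file_url},
            # Media type given as a plain string for brevity.
            "dcat:mediaType": media_type,
        },
    }


# Hypothetical identifiers, for illustration only.
record = dcat_record(
    "https://example.org/datasets/demo",
    "Demo dataset",
    "https://example.org/files/demo.nt.bz2",
    "application/n-triples",
)
print(json.dumps(record, indent=2))
```

Because the file itself stays where it is and only the description is cataloged, the same pattern works for any file type and any hosting location.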

DBpedia Contributions will be enabled soon, taking DBpedia to the moon! 🚀

In DBpedia’s future, the Databus will be used to collect community contributions more effectively, giving DBpedia an enormous boost in quantity and quality. https://databus.dbpedia.org already catalogs over 350k files with over 1 million file downloads per month! We are preparing showcases, templates, and documentation for these community contribution types:

  1. Community Extensions such as caligraph.org or AI-improved abstracts.
  2. Community Link Contributions for inclusion in the main graph.
  3. RDF profiles for DBpedia Users and Members (FOAF, Schema.org, WebID) via Databus Accounts (including publication of expertise).
  4. Dockerized RDF Tool Deployment so you can automatically load DBpedia and other RDF data into your favorite RDF tools via Databus collections. Our Databus-powered Virtuoso SPARQL Endpoint Quickstart Docker has already been deployed over 150k times! 

We do hope we will meet you and some new faces during our events next year. The association wants to get to know you because DBpedia is a community effort and would not continue to develop, improve and grow without you. We plan to have a tutorial at the LREC-COLING 2024 conference and a meeting at the SEMANTiCS conference, Sep 17-19, 2024, in Amsterdam, Netherlands.

Stay safe and check Twitter, Instagram and LinkedIn or subscribe to our Newsletter for the latest news and information.

Yours,

Julia & Maria

on behalf of the DBpedia Association

DBpedia Snapshot 2022-12 Release
https://www.dbpedia.org/blog/dbpedia-snapshot-2022-12-release/
Mon, 27 Mar 2023 09:36:32 +0000

The post DBpedia Snapshot 2022-12 Release appeared first on DBpedia Association.

We are pleased to announce immediate availability of a new edition of the free and publicly accessible SPARQL Query Service Endpoint and Linked Data Pages, for interacting with the new Snapshot Dataset. 

News since DBpedia Snapshot 2022-09

  • New Abstract Extractor from GSoC 2022 (credits to Celian Ringwald)

Work in progress: Smoothing the community issue reporting and fixing on GitHub

What is the “DBpedia Snapshot” Release?

Historically, this release has been associated with many names: “DBpedia Core”, “EN DBpedia”, and, most confusingly, just “DBpedia”. In fact, it is a combination of:

  • EN Wikipedia data: A small but very useful subset (~ 1 Billion triples or 14%) of the whole DBpedia extraction using the DBpedia Information Extraction Framework (DIEF), comprising structured information extracted from the English Wikipedia plus some enrichments from other Wikipedia language editions, notably multilingual abstracts in ar, ca, cs, de, el, eo, es, eu, fr, ga, id, it, ja, ko, nl, pl, pt, sv, uk, ru, zh.
  • Links: 62 million community-contributed cross-references and owl:sameAs links to other linked data sets on the Linked Open Data (LOD) Cloud, which make it possible to find and retrieve further information from the largest decentralized, change-sensitive knowledge graph on earth, which has formed around DBpedia since 2007.
  • Community extensions: Community-contributed extensions such as additional ontologies and taxonomies.

Release Frequency & Schedule

Going forward, releases will be scheduled for the 1th of February, May, August, and November (with +/- 5 days tolerance), and are named using the same date convention as the Wikipedia Dumps that served as the basis for the release. An example of the release timeline is shown below: 

  • December 6–8: Wikipedia dumps for June 1 become available on https://dumps.wikimedia.org/
  • December 8–20: Download and extraction with DIEF
  • Dec 20–Jan 1: Post-processing and quality-control period
  • Jan 1–Feb 15: Linked Data and SPARQL endpoint deployment

Data Freshness

Given the timeline above, the EN Wikipedia data of DBpedia Snapshot has a lag of 1-4 months. We recommend the following strategies to mitigate this:

  1. DBpedia Snapshot as a kernel for Linked Data: Following the Linked Data paradigm, we recommend using the Linked Data links to other knowledge graphs to retrieve high-quality and recent information. DBpedia’s network consists of the best knowledge engineers in the world, working together, using linked data principles to build a high-quality, open, decentralized knowledge graph network around DBpedia. Freshness and change-sensitivity are two of the greatest data-related challenges of our time, and can only be overcome by linking data across data sources. The “Big Data” approach of copying data into a central warehouse is inevitably challenged by issues such as co-evolution and scalability. 
  2. DBpedia Live: Wikipedia is unmistakably the richest, most recent body of human knowledge and source of news in the world. DBpedia Live is just minutes behind edits on Wikipedia, which means that as soon as any of the 120k Wikipedia editors presses the “save” button, DBpedia Live will extract fresh data and update. DBpedia Live consists of the DBpedia Live Sync API (for syncing into any kind of on-site database), Linked Data and a SPARQL endpoint.
  3. Latest-Core is a dynamically updating Databus Collection. Our automated extraction robot “MARVIN” publishes monthly dev versions of the full extraction, which are then refined and enriched to become Snapshot.      

Data Quality & Richness

We would like to acknowledge the excellent work of Wikipedia editors (~46k active editors for EN Wikipedia), who are ultimately responsible for collecting information in Wikipedia’s infoboxes, which are refined by DBpedia’s extraction into our knowledge graphs. Wikipedia’s infoboxes are steadily growing each month and according to our measurements grow by 150% every three years. EN Wikipedia’s infoboxes even doubled in this timeframe. This richness of knowledge drives the DBpedia Snapshot knowledge graph and is further potentiated by synergies with linked data cross-references. Statistics are given below.

Data Access & Interaction Options

Linked Data

Linked Data is a principled approach to publishing RDF data on the Web that enables interlinking data between different data sources, courtesy of the built-in power of Hyperlinks as unique Entity Identifiers.


HTML pages comprising Hyperlinks that conform to Linked Data Principles are one of the methods of interacting with data provided by the DBpedia Snapshot, be it manually via the web browser or programmatically using REST interaction patterns via the https://dbpedia.org/resource/{entity-label} pattern. Naturally, we encourage Linked Data interactions, while also expecting user-agents to honor the cache-control HTTP response header for massive crawl operations. Instructions for accessing Linked Data, available in 10 formats.
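
As a rough illustration of the programmatic route, the sketch below builds a request for a resource URL following the pattern above and asks for an RDF serialization through the HTTP Accept header. The helper names are our own, and the choice of `text/turtle` is an assumption about one of the supported formats; only the standard library is used.

```python
import urllib.parse
import urllib.request


def resource_request(entity_label, mime_type="text/turtle"):
    """Build an HTTP request for a DBpedia resource using content negotiation."""
    # Wikipedia-style labels use underscores instead of spaces.
    path = urllib.parse.quote(entity_label.replace(" ", "_"))
    url = "https://dbpedia.org/resource/" + path
    return urllib.request.Request(url, headers={"Accept": mime_type})


def fetch(entity_label, mime_type="text/turtle", timeout=30):
    """Dereference the resource and return the response body as text."""
    req = resource_request(entity_label, mime_type)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


req = resource_request("Leipzig")
print(req.full_url)
```

Calling `fetch("Leipzig")` would perform the actual lookup. For larger crawls, a client should additionally inspect the cache-control header of each response and avoid re-requesting resources that are still fresh.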

SPARQL Endpoint

This service enables some astonishing queries against Knowledge Graphs derived from Wikipedia content. The Query Services Endpoint that makes this possible is identified by http://dbpedia.org/sparql, and it currently handles 7.2 million queries daily on average. See powerful queries and instructions (incl. rates and limitations).

An effective Usage Pattern is to filter a relevant subset of entity descriptions for your use case via SPARQL and then combine with the power of Linked Data by looking up (or de-referencing) data via owl:sameAs property links en route to retrieving specific and recent data from across other Knowledge Graphs across the massive Linked Open Data Cloud.
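
The first half of this usage pattern can be sketched as follows: a SPARQL query collects the owl:sameAs targets of one entity, which a client could then dereference as Linked Data. This is a minimal sketch using the standard SPARQL protocol (`query` parameter plus an Accept header); the function names and the example resource are illustrative assumptions.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://dbpedia.org/sparql"  # endpoint named in the text


def same_as_query(resource_iri, limit=20):
    """Return a SPARQL query listing owl:sameAs targets of one resource."""
    return (
        "PREFIX owl: <http://www.w3.org/2002/07/owl#>\n"
        "SELECT ?target WHERE { <%s> owl:sameAs ?target } LIMIT %d"
        % (resource_iri, limit)
    )


def sparql_request(query):
    """Build a GET request that asks for SPARQL JSON results."""
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run(query, timeout=30):
    """Execute the query and return the bound ?target values."""
    with urllib.request.urlopen(sparql_request(query), timeout=timeout) as resp:
        data = json.load(resp)
    return [b["target"]["value"] for b in data["results"]["bindings"]]


query = same_as_query("http://dbpedia.org/resource/Leipzig", limit=5)
print(query)
```

Each IRI returned by `run(query)` points into another knowledge graph and can be looked up there to retrieve more specific or more recent data.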

Additionally, DBpedia Snapshot dumps and additional data from the complete collection of datasets derived from Wikipedia are provided by the DBpedia Databus for use in your own SPARQL-accessible Knowledge Graphs.

DBpedia Ontology

This Snapshot Release was built with DBpedia Ontology (DBO) version https://databus.dbpedia.org/ontologies/dbpedia.org/ontology--DEV/2021.11.08-124002. We thank all DBpedians for their contributions to the ontology and the mappings. See documentation and visualizations, class tree and properties, wiki.

DBpedia Snapshot Statistics

Overview. Overall the current Snapshot Release contains more than 850 million facts (triples).

The DBpedia ontology is the heart of DBpedia. Our community is continuously contributing to the DBpedia ontology schema and the DBpedia infobox-to-ontology mappings by actively using the DBpedia Mappings Wiki.

The current Snapshot Release utilizes a total of 55 thousand properties, of which 1377 are defined by the DBpedia ontology.

Classes. Knowledge in Wikipedia is constantly growing at a rapid pace. We use the DBpedia Ontology Classes to measure the growth. For each class we give the total number in this release and, in brackets, (a) growth relative to the previous release, which can temporarily be negative, and (b) growth compared to Snapshot 2016-10:

  • Persons: 1792308 (1.01%, 1.13%)
  • Places: 748372 (1.00%, 1820.86%), including but not limited to 590481 (1.00%, 5518.51%) populated places
  • Works: 610589 (1.00%, 619.89%), including but not limited to
    • 157566 (1.00%, 1.38%) music albums
    • 144415 (1.01%, 15.94%) films
    • 24829 (1.01%, 12.53%) video games
  • Organizations: 345523 (1.01%, 109.31%), including but not limited to
    • 87621 (1.01%, 2.25%) companies
    • 64507 (1.00%, 64507.00%) educational institutions
  • Species: 1933436 (1.01%, 322239.33%)
  • Plants: 7718 (0.82%, 1.71%)
  • Diseases: 10591 (1.00%, 8.54%)

Detailed Growth of Classes: The image below shows the detailed growth for one class. Click on the links for other classes: Place, PopulatedPlace, Work, Album, Film, VideoGame, Organisation, Company, EducationalInstitution, Species, Plant, Disease. For further classes, adapt the query by replacing the <http://dbpedia.org/ontology/CLASS> URI. Note that 2018 was a development phase with some failed extractions. The stats were generated with the Databus VOID Mod.

Links. Linked Data cross-references between decentral datasets are the foundation and access point to the Linked Data Web. The latest Snapshot Release provides over 130.6 million links from 7.62 million entities to 179 external sources.

Top 11

33,975,305 http://www.wikidata.org
 7,206,254 https://global.dbpedia.org
 4,308,772 http://yago-knowledge.org
 3,855,108 http://de.dbpedia.org
 3,731,002 http://fr.dbpedia.org
 2,991,921 http://viaf.org
 2,929,808 http://it.dbpedia.org
 2,925,530 http://es.dbpedia.org
 2,788,703 http://fa.dbpedia.org
 2,587,004 http://ru.dbpedia.org
 2,580,398 http://sr.dbpedia.org

Top 10 without DBpedia namespaces

33,975,305 http://www.wikidata.org
 4,308,772 http://yago-knowledge.org
 2,991,921 http://viaf.org
 1,708,533 http://d-nb.info
   612,227 http://sws.geonames.org
   596,134 http://umbel.org
   537,602 http://data.bibliotheken.nl
   430,839 http://www.w3.org
   422,989 http://musicbrainz.org
   104,433 http://linkedgeodata.org
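
The relationship between the two rankings can be sketched in a few lines: drop every target whose host is dbpedia.org or one of its subdomains, then sort by link count. The input below is the Top 11 list from this post; the function names are illustrative and not part of any DBpedia tooling.

```python
from urllib.parse import urlparse

# Link counts per target, taken from the Top 11 list above.
TOP11 = {
    "http://www.wikidata.org": 33975305,
    "https://global.dbpedia.org": 7206254,
    "http://yago-knowledge.org": 4308772,
    "http://de.dbpedia.org": 3855108,
    "http://fr.dbpedia.org": 3731002,
    "http://viaf.org": 2991921,
    "http://it.dbpedia.org": 2929808,
    "http://es.dbpedia.org": 2925530,
    "http://fa.dbpedia.org": 2788703,
    "http://ru.dbpedia.org": 2587004,
    "http://sr.dbpedia.org": 2580398,
}


def is_dbpedia(target):
    """True if the target lives under dbpedia.org or one of its subdomains."""
    host = urlparse(target).hostname or ""
    return host == "dbpedia.org" or host.endswith(".dbpedia.org")


def top_external(counts, n=10):
    """Rank non-DBpedia targets by link count, largest first."""
    external = [(t, c) for t, c in counts.items() if not is_dbpedia(t)]
    return sorted(external, key=lambda tc: tc[1], reverse=True)[:n]


for target, count in top_external(TOP11):
    print(f"{count:>12,} {target}")
```

Applied to the eleven entries above, only Wikidata, YAGO and VIAF remain, which matches the first three rows of the second list; the remaining rows come from targets that rank below the Top 11.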

DBpedia Extraction Dumps on the Databus

All extracted files are reachable via the DBpedia account on the Databus. The Databus has two main structures:

Snapshot Download. For downloading DBpedia Snapshot, we prepared this collection, which also includes detailed release notes:

https://databus.dbpedia.org/dbpedia/collections/dbpedia-snapshot-2022-03

The collection is roughly equivalent to http://downloads.dbpedia.org/2016-10/core/

Collections can be downloaded in many different ways. Some download modalities, such as bash script, SPARQL, and plain URL list, are found in the tabs at the collection. Files are provided as bzip2-compressed N-Triples files. In case you need a different format or compression, you can also use the “Download-As” function of the Databus Client (GitHub), e.g. -s $collection -c gzip would download the collection and convert it to GZIP during download.
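
If you have already downloaded the bzip2 files and only need another compression, the conversion can also be done locally. The following is a minimal standard-library sketch, not the Databus Client itself; the function names are our own, and the triple count relies on the N-Triples convention of one statement per line.

```python
import bz2
import gzip
import shutil


def bz2_to_gzip(src_path, dst_path):
    """Stream-recompress a .bz2 file to .gz without loading it into memory."""
    with bz2.open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


def count_triples(path):
    """Count non-empty, non-comment lines of a bzip2-compressed N-Triples file."""
    with bz2.open(path, "rt", encoding="utf-8") as lines:
        return sum(
            1 for line in lines if line.strip() and not line.startswith("#")
        )
```

For example, `bz2_to_gzip("labels.nt.bz2", "labels.nt.gz")` would rewrite a downloaded dump (the file name is hypothetical), and `count_triples("labels.nt.bz2")` gives a quick sanity check of its size.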

Replicating DBpedia Snapshot on your server can be done via Docker, see https://hub.docker.com/r/dbpedia/virtuoso-sparql-endpoint-quickstart 

git clone https://github.com/dbpedia/virtuoso-sparql-endpoint-quickstart.git

cd virtuoso-sparql-endpoint-quickstart

COLLECTION_URI=https://databus.dbpedia.org/dbpedia/collections/dbpedia-snapshot-2022-09 VIRTUOSO_ADMIN_PASSWD=password docker-compose up

Download files from the whole DBpedia extraction. The whole extraction consists of approx. 20 Billion triples and 5000 files created from 140 languages of Wikipedia, Commons and Wikidata. They can be found in https://databus.dbpedia.org/dbpedia/(generic|mappings|text|wikidata)

You can copy-edit a collection and create your own customized collections via “Actions” -> “Copy Edit”. For example, you can copy-edit the snapshot collection above, remove some files that you do not need and add files from other languages. Please see the Rhizomer use case: Best way to download specific parts of DBpedia. Of course, this only refers to the archived dumps on the Databus for users who want to bulk download and deploy into their own infrastructure. Linked Data and SPARQL allow for filtering the content using a small data pattern.

Acknowledgments

First and foremost, we would like to thank our open community of knowledge engineers for finding & fixing bugs and for supporting us by writing data tests. We would also like to acknowledge the DBpedia Association members for constantly innovating the areas of knowledge graphs and linked data and pushing the DBpedia initiative with their know-how and advice. OpenLink Software supports DBpedia by hosting SPARQL and Linked Data; University Mannheim, the German National Library of Science and Technology (TIB) and the Computer Center of University Leipzig provide persistent backups and servers for extracting data. We thank Marvin Hofer and Mykola Medynskyi for technical preparation. This work was partially supported by grants from the Federal Ministry for Economics and Climate Action (BMWK) for the LOD-GEOSS Project (03EI1005E), PenFLaaS (100594042) as well as for the PLASS Project (01MD19003D).

GSoC2022 – Call for Contributors
https://www.dbpedia.org/blog/gsoc2022/
Fri, 11 Mar 2022 11:49:46 +0000

The post GSoC2022 – Call for Contributors appeared first on DBpedia Association.

Pinky: Gee, Brain, what are we gonna do this year?
Brain: Wear a mask, keep our distance, and do the same thing we do every year, Pinky. Taking over GSoC2022.

For the 11th year in a row, we have been accepted to be part of this incredible program to support young ambitious developers who want to work with open-source organizations like DBpedia.

So far, each year has brought us new project ideas, many amazing students and great project results that shaped the future of DBpedia. Even though Covid-19 changed a lot in the world, it couldn’t shake Google Summer of Code (GSoC) much. The program, designed to mentor youngsters from afar, is almost too perfect for us. One of the advantages of GSoC, especially in times like these, is the chance to work on projects remotely while still getting a first deep dive into Open Source projects like ours.

DBpedia is now looking for contributors who want to work with us during the upcoming summer months.  

What is Google Summer of Code?

Google Summer of Code is a global program focused on bringing developers into open source software development. Funds will be given to all new or beginner open source contributors over 18 years of age to work for two and a half months (or longer) on a specific task. For GSoC newbies, this short video and the information provided on their website will explain all there is to know about GSoC2022.

And this is how it works …

Step 1: Check out one of our projects here or draft your own.
Step 2: Get in touch with our mentors as soon as possible and write up a project proposal of at least 8 pages. Information about our proposal structure and a template are available here.
Step 3: After a selection phase, contributors are matched with a specific project and mentor(s) and start working on the project.

Application Procedure

Further information on the application procedure is available in our DBpedia Guidelines. There you will find information on how to contact us and how to appropriately apply for GSoC2022. Please also note the official GSoC 2022 timeline for your proposal submission and make sure to submit on time. Unfortunately, extensions cannot be granted. Final submission deadline is April 19, 2022 at 18:00 UTC.

Contact

Detailed information on how to apply is available on the DBpedia website. We’ve prepared an information kit for you. Please find all necessary information regarding the student application procedure here.

And in case you still have questions, please do not hesitate to contact us via dbpedia@infai.org.

Stay safe and check Twitter or LinkedIn. Furthermore, you can subscribe to our Newsletter for the latest news and information around DBpedia.

Finally, we are looking forward to your contribution!

Yours, DBpedia Association

A year with DBpedia – Retrospective Part 2/2021
https://www.dbpedia.org/blog/a-year-with-dbpedia-retrospective-part-2/
Thu, 06 Jan 2022 10:40:18 +0000

The post A year with DBpedia – Retrospective Part 2/2021 appeared first on DBpedia Association.

This is the final part of our journey through 2021. In the previous blog post we already presented DBpedia highlights, events and tutorials. Now we want to take a look at the second half of 2021 and give an outlook for 2022.

LSWT 2021

We kicked off the summer with an online tutorial at the Leipziger Semantic Web Day (LSWT). For the first time ever, the LSWT team extended the program and organized a second conference day for DBpedia enthusiasts. Many thanks to the hosts and organizing team! It was a pleasure to be part of the LSWT again. If you were unable to take part in the tutorial, please check our slides here or watch the video on the DBpedia YouTube channel.

DBpedia Snapshot 2021-06 Release

On 23rd of July, 2021 we announced the DBpedia Snapshot 2021-06 Release. Historically, this release has been associated with many names: “DBpedia Core”, “EN DBpedia”, and, most confusingly, just “DBpedia”. In fact, it is a combination of the EN Wikipedia data, 62 million community-contributed cross-reference links, and community extensions such as additional ontologies and taxonomies. Read the announcement on the blog.

Tutorial at the LDK Conference and DBpedia Day at the SEMANTiCS Conference

At the beginning of September 2021 we jumped on a plane and gave a tutorial at the Language, Data and Knowledge (LDK) conference in Zaragoza, Spain. Building upon the success of the previous events held in Galway, Ireland in 2017, and in Leipzig, Germany in 2019, the conference brought together researchers from across disciplines concerned with the acquisition, curation and use of language data in the context of data science and knowledge-based applications. This tutorial was a great success and if you would like to catch up and check our slides, please click https://tinyurl.com/TutAtLDK. A few days later we travelled to Amsterdam, The Netherlands, to join this year’s SEMANTiCS Conference.

The DBpedia Day was held on the last day of the conference, 9th of September, at the Theater de Meervaart. Our CEO, Sebastian Hellmann, opened the DBpedia Day with an update about the DBpedia Databus and our members. He presented the huge and diverse network DBpedia has built up in the last 13 years. Afterwards, Maria-Esther Vidal, TIB, completed the opening session with her keynote “Enhancing Linked Data Trustability and Transparency through Knowledge-driven Data Ecosystems”. Furthermore, we organized a member presentation session, an ontology session and an NLP session, where experts presented NLP and DBpedia-related topics. In case you missed the event, all slides are also available on our event page. Further insights, feedback and photos about the event are available on Twitter via #DBpediaDay.

Member Features on the Blog

At the beginning of November 2020 we started the member feature on our blog. In 2021 we continued and published further interesting posts and news about our members. We gave our members the chance to present special products, tools and applications. We published several posts in which members, namely Triply, WordLift, Wallscope, eccenca, Diffbot, and the Network Institute (NI) of VU Amsterdam, shared unique insights with the community. Next year we will continue with interesting posts and presentations. Stay tuned!

DBpedia Snapshot 2021-09 Release

On October 22, 2021 we announced the immediate availability of a new edition of the free and publicly accessible SPARQL Query Service Endpoint and Linked Data Pages, for interacting with the new Snapshot Dataset. Since the last release we made a few changes: release notes are now maintained in the Databus collection (2021-09), we improved the image and abstract extractor, and the DBpedia team worked on the community issue reporting and fix tracker on GitHub. The full release description including further statistics can be found on https://www.dbpedia.org/blog/snapshot-2021-09-release/.

DBpedia Knowledge Graph Tutorial for Beginners

On 2nd of December, 2021 we organized the masterclass “Knowledge Graph tutorial for beginners” at the Connected Data World event. In this masterclass, participants learned how to consume the DBpedia Knowledge Graph with the least amount of effort. Furthermore, the masterclass introduced the DBpedia KG and we explained its dataset partitions. In case you missed the event, please watch the recorded session here.  

We do hope we will meet you and some new faces during our events next year. The association wants to get to know you because DBpedia is a community effort and would not continue to develop, improve and grow without you. We plan to have meetings or tutorials at the Data Week in Leipzig, the Web Conference’22, and the SEMANTiCS’22 conference. We wish you a happy New Year!

Stay safe and check Twitter, Instagram and LinkedIn or subscribe to our Newsletter for the latest news and information.

Yours,

Julia 

on behalf of the DBpedia Association

DBpedia Snapshot 2021-09 Release
https://www.dbpedia.org/blog/snapshot-2021-09-release/
Fri, 22 Oct 2021 14:07:07 +0000

The post DBpedia Snapshot 2021-09 Release appeared first on DBpedia Association.

We are pleased to announce immediate availability of a new edition of the free and publicly accessible SPARQL Query Service Endpoint and Linked Data Pages, for interacting with the new Snapshot Dataset.

News since DBpedia Snapshot 2021-06

  • Release notes are now maintained in the Databus Collection (2021-09)
  • Image and Abstract Extractor was improved
  • Work in progress: Smoothing the community issue reporting and fixing on GitHub

What is the “DBpedia Snapshot” Release?

Historically, this release has been associated with many names: “DBpedia Core”, “EN DBpedia”, and, most confusingly, just “DBpedia”. In fact, it is a combination of:

  • EN Wikipedia data: A small but very useful subset (~ 1 Billion triples or 14%) of the whole DBpedia extraction using the DBpedia Information Extraction Framework (DIEF), comprising structured information extracted from the English Wikipedia plus some enrichments from other Wikipedia language editions, notably multilingual abstracts in ar, ca, cs, de, el, eo, es, eu, fr, ga, id, it, ja, ko, nl, pl, pt, sv, uk, ru, zh.

  • Links: 62 million community-contributed cross-references and owl:sameAs links to other linked data sets on the Linked Open Data (LOD) Cloud, which make it possible to find and retrieve further information from the largest decentralized, change-sensitive knowledge graph on earth, which has formed around DBpedia since 2007.

  • Community extensions: Community-contributed extensions such as additional ontologies and taxonomies.

Release Frequency & Schedule

Going forward, releases will be scheduled for the 15th of February, May, July, and October (with +/- 5 days tolerance), and are named using the same date convention as the Wikipedia Dumps that served as the basis for the release. An example of the release timeline is shown below:

  • September 6–8: Wikipedia dumps for June 1 become available on https://dumps.wikimedia.org/
  • Sep 8–20: Download and extraction with DIEF
  • Sep 20–Oct 10: Post-processing and quality-control period
  • Oct 10–20: Linked Data and SPARQL endpoint deployment

Data Freshness

Given the timeline above, the EN Wikipedia data of DBpedia Snapshot has a lag of 1-4 months. We recommend the following strategies to mitigate this:

  1. DBpedia Snapshot as a kernel for Linked Data: Following the Linked Data paradigm, we recommend using the Linked Data links to other knowledge graphs to retrieve high-quality and recent information. DBpedia’s network consists of the best knowledge engineers in the world, working together, using linked data principles to build a high-quality, open, decentralized knowledge graph network around DBpedia. Freshness and change-sensitivity are two of the greatest data-related challenges of our time, and can only be overcome by linking data across data sources. The “Big Data” approach of copying data into a central warehouse is inevitably challenged by issues such as co-evolution and scalability.

  2. DBpedia Live: Wikipedia is unmistakably the richest, most recent body of human knowledge and source of news in the world. DBpedia Live is just minutes behind edits on Wikipedia, which means that as soon as any of the 120k Wikipedia editors presses the “save” button, DBpedia Live will extract fresh data and update. DBpedia Live is currently in tech preview status and we are working towards a highly available and reliable business API with support. DBpedia Live consists of the DBpedia Live Sync API (for syncing into any kind of on-site database), Linked Data and a SPARQL endpoint.

  3. Latest-Core is a dynamically updating Databus Collection. Our automated extraction robot “MARVIN” publishes monthly dev versions of the full extraction, which are then refined and enriched to become Snapshot.

Data Quality & Richness

We would like to acknowledge the excellent work of Wikipedia editors (~46k active editors for EN Wikipedia), who are ultimately responsible for collecting information in Wikipedia’s infoboxes, which are refined by DBpedia’s extraction into our knowledge graphs. Wikipedia’s infoboxes are steadily growing each month and according to our measurements grow by 150% every three years. EN Wikipedia’s infoboxes even doubled in this timeframe. This richness of knowledge drives the DBpedia Snapshot knowledge graph and is further potentiated by synergies with linked data cross-references. Statistics are given below.

Data Access & Interaction Options

Linked Data

Linked Data is a principled approach to publishing RDF data on the Web that enables interlinking data between different data sources, courtesy of the built-in power of Hyperlinks as unique Entity Identifiers.

HTML pages comprising Hyperlinks that conform to Linked Data Principles are one of the methods of interacting with data provided by the DBpedia Snapshot, be it manually via the web browser or programmatically using REST interaction patterns via the https://dbpedia.org/resource/{entity-label} pattern. Naturally, we encourage Linked Data interactions, while also expecting user-agents to honor the cache-control HTTP response header for massive crawl operations. See the instructions for accessing Linked Data, which is available in 10 formats.
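To make the programmatic route concrete, here is a minimal sketch of how a client might prepare a content-negotiated request for a single entity. It assumes that Turtle (text/turtle) is among the available formats and that spaces in a label map to underscores, as in Wikipedia titles; the helper name build_resource_request is our own and not part of any DBpedia library.

```python
from urllib.parse import quote
from urllib.request import Request

# URL pattern quoted in the text above.
RESOURCE_BASE = "https://dbpedia.org/resource/"


def build_resource_request(entity_label: str, accept: str = "text/turtle") -> Request:
    """Build (but do not send) a content-negotiated request for one entity.

    Hypothetical helper: the Accept value is an assumption about which
    serialization the caller wants.
    """
    # Assumption: resource names follow Wikipedia titles (spaces -> underscores).
    name = entity_label.strip().replace(" ", "_")
    url = RESOURCE_BASE + quote(name, safe="_(),")
    return Request(url, headers={"Accept": accept})


req = build_resource_request("Leipzig")
print(req.full_url)
print(req.get_header("Accept"))
```

Passing the request to urllib.request.urlopen would perform the actual lookup. A crawler should additionally read the Cache-Control header of each response and avoid re-fetching entities that are still fresh.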

SPARQL Endpoint

This service enables some astonishing queries against Knowledge Graphs derived from Wikipedia content. The Query Services Endpoint that makes this possible is identified by http://dbpedia.org/sparql, and it currently handles 7.2 million queries daily on average. See powerful queries and instructions (incl. rates and limitations).

An effective Usage Pattern is to filter a relevant subset of entity descriptions for your use case via SPARQL and then combine it with the power of Linked Data by looking up (or de-referencing) data via owl:sameAs property links, en route to retrieving specific and recent data from other Knowledge Graphs across the massive Linked Open Data Cloud.
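The usage pattern above can be sketched in two small steps: build a SPARQL request that selects a subset of entities together with their owl:sameAs links, then pull the link targets out of the JSON result so they can be de-referenced one by one. This is an illustration under assumptions: the query (ten films) is only an example, the format parameter is a convenience we assume the endpoint accepts in addition to the standard query parameter, and both function names are hypothetical.

```python
from urllib.parse import urlencode

SPARQL_ENDPOINT = "http://dbpedia.org/sparql"  # endpoint named in the text

# Illustrative query: a small subset of films plus their owl:sameAs links.
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?film ?same WHERE {
  ?film a dbo:Film ;
        owl:sameAs ?same .
} LIMIT 10
"""


def build_query_url(query: str, fmt: str = "application/sparql-results+json") -> str:
    """Return a GET URL for the query (nothing is sent here)."""
    # Assumption: the endpoint honours a "format" parameter for the result type.
    params = {"query": query.strip(), "format": fmt}
    return SPARQL_ENDPOINT + "?" + urlencode(params)


def same_as_targets(results: dict) -> list:
    """Collect the ?same URIs from a SPARQL JSON results document."""
    bindings = results["results"]["bindings"]
    return [row["same"]["value"] for row in bindings if "same" in row]
```

Each URI returned by same_as_targets points into another knowledge graph and can be looked up there with a Linked Data request, which is where the more specific or more recent data comes from.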

Additionally, DBpedia Snapshot dumps and additional data from the complete collection of datasets derived from Wikipedia are provided by the DBpedia Databus for use in your own SPARQL-accessible Knowledge Graphs.

DBpedia Ontology

This Snapshot Release was built with DBpedia Ontology (DBO) version https://databus.dbpedia.org/ontologies/dbpedia.org/ontology--DEV/2021.07.09-070001

We thank all DBpedians for their contributions to the ontology and the mappings. See documentation and visualizations, class tree and properties, wiki.

DBpedia Snapshot Statistics

Overview. Overall the current Snapshot Release contains more than 850 million facts (triples).

The DBpedia ontology is the heart of DBpedia. Our community is continuously contributing to the DBpedia ontology schema and the DBpedia infobox-to-ontology mappings by actively using the DBpedia Mappings Wiki.

The current Snapshot Release utilizes a total of 55 thousand properties, of which 1,377 are defined by the DBpedia ontology.

Classes. Knowledge in Wikipedia is constantly growing at a rapid pace. We use the DBpedia Ontology Classes to measure the growth: Total number in this release (in brackets we give: a) growth compared to the previous release, which can be negative temporarily and b) growth compared to Snapshot 2016-10):

  • Persons: 1,730,033 (2.28%, 8.85%)

  • Places: 737,512 (-25.64%, -11.42%), including but not limited to 582,191 (-0.14%, 13.35%) populated places

  • Works: 603,110 (1.34%, 21.58%), including but not limited to

    • 157,137 (1.94%, 38.02%) music albums

    • 142,135 (0.75%, 1466.74%) films

    • 24,452 (0.85%, 1133.70%) video games

  • Organizations: 339,927 (-0.13%, -42.35%), including but not limited to

    • 85,726 (1.20%, 59.79%) companies

    • 64,474 (1.01%, 17.68%) educational institutions

  • Species: 160,535 (-3.71%, -46.47%)

  • Plants: 10,509 (-9.41%, 8.10%)

  • Diseases: 10,512 (-9.39%, 747.74%)
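For readers who want to check or reuse the bracketed figures, the following sketch shows how such growth percentages are usually computed and how a previous count can be estimated from a current count and its growth rate. We assume the conventional definition (change relative to the earlier value); the function names are ours, and the back-calculated numbers are estimates, not published statistics.

```python
def growth_percent(previous: float, current: float) -> float:
    """Relative change from previous to current, in percent."""
    return (current - previous) / previous * 100.0


def implied_previous(current: float, growth_pct: float) -> float:
    """Estimate the earlier count from the current count and its growth rate."""
    return current / (1.0 + growth_pct / 100.0)


# Persons: 1,730,033 with 2.28% growth compared to the previous release.
persons_now = 1_730_033
persons_before = implied_previous(persons_now, 2.28)
print(round(persons_before))

# A negative rate means the class shrank, e.g. Places at -25.64%.
places_before = implied_previous(737_512, -25.64)
print(round(places_before))
```

Under this definition, the Persons class held roughly 1.69 million entities in the previous release, and a negative value such as the one for Places simply means the earlier count was higher than the current one.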

Detailed Growth of Classes: The image below shows the detailed growth for one class. Click on the links for other classes: Place, PopulatedPlace, Work, Album, Film, VideoGame, Organisation, Company, EducationalInstitution, Species, Plant, Disease. For further classes adapt the query by replacing the <http://dbpedia.org/ontology/CLASS> URI. Note that 2018 was a development phase with some failed extractions. The stats were generated with the Databus VOID Mod.

Links. Linked Data cross-references between decentral datasets are the foundation and access point to the Linked Data Web. The latest Snapshot Release provides over 127.8 million links from 7.47 million entities to 179 external sources.

Top 11

33,403,279 www.wikidata.org

6,847,067 global.dbpedia.org

4,308,772 yago-knowledge.org

3,712,468 de.dbpedia.org

3,589,032 fr.dbpedia.org

2,917,799 viaf.org

2,841,527 it.dbpedia.org

2,816,382 es.dbpedia.org

2,567,507 fa.dbpedia.org

2,542,619 sr.dbpedia.org

Top 10 without DBpedia namespaces

33,403,279 www.wikidata.org

4,308,772 yago-knowledge.org

2,917,799 viaf.org

1,614,381 d-nb.info

596,134 umbel.org

581,558 sws.geonames.org

521,985 data.bibliotheken.nl

430,839 www.w3.org

369,309 musicbrainz.org

104,433 linkedgeodata.org
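The second list is the first one with all DBpedia-hosted targets filtered out. A small sketch of that filtering step, using the counts from the first list above, could look as follows; treating every host that is dbpedia.org or ends in .dbpedia.org as a DBpedia namespace is our assumption about how the lists were separated.

```python
# Link counts per target host, copied from the first list above.
LINK_COUNTS = {
    "www.wikidata.org": 33_403_279,
    "global.dbpedia.org": 6_847_067,
    "yago-knowledge.org": 4_308_772,
    "de.dbpedia.org": 3_712_468,
    "fr.dbpedia.org": 3_589_032,
    "viaf.org": 2_917_799,
    "it.dbpedia.org": 2_841_527,
    "es.dbpedia.org": 2_816_382,
    "fa.dbpedia.org": 2_567_507,
    "sr.dbpedia.org": 2_542_619,
}


def is_dbpedia_host(host: str) -> bool:
    """Assumed rule: dbpedia.org itself or any of its subdomains."""
    return host == "dbpedia.org" or host.endswith(".dbpedia.org")


def top_external(counts: dict, n: int) -> list:
    """Return the n largest non-DBpedia link targets as (host, count) pairs."""
    external = [(h, c) for h, c in counts.items() if not is_dbpedia_host(h)]
    return sorted(external, key=lambda pair: pair[1], reverse=True)[:n]


for host, count in top_external(LINK_COUNTS, 3):
    print(f"{count:,} {host}")
```

Applied to the ten hosts above, only three external targets remain, and they match the first three entries of the list without DBpedia namespaces.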

DBpedia Extraction Dumps on the Databus

All extracted files are reachable via the DBpedia account on the Databus. The Databus has two main structures:

Snapshot Download. For downloading DBpedia Snapshot, we prepared this collection, which also includes detailed release notes: https://databus.dbpedia.org/dbpedia/collections/dbpedia-snapshot-2021-06

The collection is roughly equivalent to http://downloads.dbpedia.org/2016-10/core/

Collections can be downloaded in many different ways. Some download modalities, such as bash script, SPARQL, and plain URL list, are found in the tabs at the collection. Files are provided as bzip2-compressed N-Triples files. In case you need a different format or compression, you can also use the “Download-As” function of the Databus Client (GitHub), e.g. -s $collection -c gzip would download the collection and convert it to GZIP during download.
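If you have already downloaded the bzip2 files and only need a different compression, the conversion can also be done locally. The sketch below is not the Databus Client; it merely illustrates the idea of re-compressing a file from bzip2 to gzip in a streaming fashion with the Python standard library. The function name and the chunk size are our own choices.

```python
import bz2
import gzip
import shutil


def bz2_to_gzip(src_path: str, dst_path: str, chunk_size: int = 1024 * 1024) -> None:
    """Re-compress a .bz2 file as .gz without loading it fully into memory."""
    with bz2.open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        # Copy the decompressed bytes in chunks; gzip compresses on write.
        shutil.copyfileobj(src, dst, chunk_size)
```

Because the data is streamed chunk by chunk, memory use stays small even for large dump files; a call such as bz2_to_gzip("labels.ttl.bz2", "labels.ttl.gz") (file names are placeholders) leaves the source file untouched.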

Replicating DBpedia Snapshot on your server can be done via Docker, see https://hub.docker.com/r/dbpedia/virtuoso-sparql-endpoint-quickstart

git clone https://github.com/dbpedia/virtuoso-sparql-endpoint-quickstart.git

cd virtuoso-sparql-endpoint-quickstart

COLLECTION_URI=https://databus.dbpedia.org/dbpedia/collections/dbpedia-snapshot-2021-09 VIRTUOSO_ADMIN_PASSWD=password docker-compose up

Download files from the whole DBpedia extraction. The whole extraction consists of approx. 20 billion triples and 5,000 files created from 140 languages of Wikipedia, Commons and Wikidata. They can be found in https://databus.dbpedia.org/dbpedia/(generic|mappings|text|wikidata)

You can copy-edit a collection and create your own customized collections via “Actions” -> “Copy Edit”, e.g. you can Copy Edit the snapshot collection above, remove some files that you do not need and add files from other languages. Please see the Rhizomer use case: Best way to download specific parts of DBpedia. Of course, this only refers to the archived dumps on the Databus for users who want to bulk download and deploy into their own infrastructure. Linked Data and SPARQL allow for filtering the content using a small data pattern.
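The bracketed part of the address above is shorthand for four separate group URLs. As a small illustration, the following sketch expands such a pattern into the individual addresses; the helper expand_alternatives is hypothetical and handles only a single, non-nested group of alternatives.

```python
import re

# Shorthand pattern quoted in the text above.
PATTERN = "https://databus.dbpedia.org/dbpedia/(generic|mappings|text|wikidata)"


def expand_alternatives(pattern: str) -> list:
    """Expand one "(a|b|c)" group in the pattern into separate strings."""
    match = re.search(r"\(([^()]*)\)", pattern)
    if match is None:
        return [pattern]  # nothing to expand
    prefix, suffix = pattern[:match.start()], pattern[match.end():]
    return [prefix + alt + suffix for alt in match.group(1).split("|")]


for url in expand_alternatives(PATTERN):
    print(url)
```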

Acknowledgments

First and foremost, we would like to thank our open community of knowledge engineers for finding & fixing bugs and for supporting us by writing data tests. We would also like to acknowledge the DBpedia Association members for constantly innovating the areas of knowledge graphs and linked data and pushing the DBpedia initiative with their know-how and advice. OpenLink Software supports DBpedia by hosting SPARQL and Linked Data; University Mannheim, the German National Library of Science and Technology (TIB) and the Computer Center of University Leipzig provide persistent backups and servers for extracting data. We thank Marvin Hofer and Mykola Medynskyi for technical preparation. This work was partially supported by grants from the Federal Ministry for Economic Affairs and Energy of Germany (BMWi) for the LOD-GEOSS Project (03EI1005E), as well as for the PLASS Project (01MD19003D).

The post DBpedia Snapshot 2021-09 Release appeared first on DBpedia Association.

Wrap Up: DBpedia Tech Tutorial @ Knowledge Graph Conference 2021 https://www.dbpedia.org/blog/dbpedia-tutorial-kgc-2021/ Tue, 11 May 2021 08:45:28 +0000

The post Wrap Up: DBpedia Tech Tutorial @ Knowledge Graph Conference 2021 appeared first on DBpedia Association.

On Tuesday the 4th of May, DBpedia organized a tutorial at the Knowledge Graph Conference (KGC) 2021. The ultimate goal of the tutorial was to teach the participants all relevant tech around DBpedia, the knowledge graph, the infrastructure and possible use cases. The tutorial was aimed at existing and potential new users of DBpedia, developers who wish to learn how to replicate DBpedia infrastructure, service providers, data providers as well as data scientists.

Below, we give a brief retrospective of the presentations. For further details of the presentations, follow the link to the slides.

Opening

The tutorial, which was held online, was opened by Milan Dojchinovski (InfAI / DBpedia Association / CTU in Prague) with some general information about the program of the tutorial, the scope and the technical information.

DBpedia in a Nutshell session

After the short opening, Milan continued with the first topic, the background on the DBpedia Association: how it all started and the evolution of DBpedia. Linked Data and the LOD cloud were also addressed, as well as the mappings, extractors and data groups (e.g. mappings, generic, text, wikidata). Then the DBpedia Ontology was presented and explained. Milan concluded the first topic with information on the DBpedia SPARQL endpoint and the DBpedia Databus platform.

Getting Started with DBpedia session

The next point on the program was split into two subtopics. First of all, Jan Forberg (InfAI / DBpedia Association) explained where to find data, including the DBpedia SPARQL endpoint, the DBpedia Databus platform as a repository for DBpedia and related datasets, and the novel “collections” concept. Moreover, DBpedia services such as DBpedia Lookup and DBpedia Spotlight were presented.

Afterwards Jan explained how to use the data hosted on the Databus. Starting by selecting particular artifacts, he explained the Docker container where data can be downloaded and a simple bash script to submit SPARQL and retrieve specific data artifacts.

Building National Knowledge Graphs using DBpedia Tech

In the following session, Johannes Frey (InfAI / DBpedia Association) explained how to build national knowledge graphs using DBpedia Tech. The use case of the Dutch National Knowledge Graph was explained as an example. The Dutch National Knowledge Graph was presented during the DBpedia Hackathon 2020. For further information feel free to have a look at the presentations of the Hackathon 2020 here https://tinyurl.com/kgia-2020-dnkg. Please also see all relevant data here https://databus.dbpedia.org/dnkg/fusion/dutch-national-kg/

DBpedia Technology Stack

Turning to the DBpedia Technology Stack, Jan started with the DBpedia Databus platform. He explained how the Databus platform works, the benefits (DataIDs and Simple Retrieval), the Databus SPARQL endpoints, the Web API and the Maven Plugin. After that, Jan presented the Dockerized Services including DBpedia Virtuoso and DBpedia Plugin, DBpedia Spotlight (incl. use cases) and DBpedia Lookup.

Marvin Hofer (InfAI / DBpedia Association) then explained the DBpedia release process on the Databus and presented his work on debugging DBpedia and the DBpedia Mods technology. Marvin also explained the quality assurance process using the concept of minidumps.

Afterwards, Johannes explained (Pre)fusion, ID management and the novel concept of cartridges.

Subsequently, Denis Streitmatter (InfAI / DBpedia Association) presented the DBpedia Archivo ontology manager and how to include your ontology there. He also explained various use cases, e.g. how to find an ontology, how to test your ontology and how to back it up. Then he presented the 5-star schema for ontology tests and the SHACL-based tests for ontologies. Please read the official DBpedia Archivo call here https://www.dbpedia.org/blog/dbpedia-archivo-call-to-improve-the-web-of-ontologies/

Contributions to DBpedia

Towards the end of the tutorial, Milan explained how to improve mappings or introduce new mappings. He talked about improving the DBpedia Information Extraction Framework as well as contributing DBpedia tests. He then explained how to contribute mappings and links for knowledge cartridges and how to write Mods for the Databus.

In case you missed the event, our presentation is also available on the DBpedia event page. Further insights, feedback and photos about the event are available on Twitter (#DBpediaTutorial hashtag).

We are now looking forward to the next DBpedia tutorial, which will be held on September 1, 2021 co-located with the LDK conference in Zaragoza, Spain. Check more details here and register now! Furthermore, we will organize the DBpedia Day on September 9, 2021 at the Semantics Conference in Amsterdam. We are looking forward to meeting all Dutch DBpedians there! 

Stay safe and check Twitter or LinkedIn. Furthermore, you can subscribe to our Newsletter for the latest news and information around DBpedia.

Yours DBpedia Association

More than 50 DBpedia enthusiasts joined the Community Meeting in Karlsruhe. https://www.dbpedia.org/blog/community-meeting-in-karlsruhe/ Thu, 19 Sep 2019 13:07:07 +0000

The post More than 50 DBpedia enthusiasts joined the Community Meeting in Karlsruhe. appeared first on DBpedia Association.

SEMANTiCS is THE leading European conference in the field of semantic technologies and the platform for professionals who make semantic computing work, understand its benefits and know its limitations.

Below, we give a brief retrospective of the presentations.

Opening Session

Katja Hose – “Querying the web of data”

… on the search for the killer app.

The concept of Linked Open Data and the promise of the Web of Data have been around for over a decade now. Yet, the great potential of free access to a broad range of data that these technologies offer has not yet been fully exploited. This talk will therefore review the current state of the art, highlight the main challenges from a query processing perspective, and sketch potential ways to solve them. Slides are available here.

Dan Weitzner – “timbr-DBpedia – Exploration and Query of DBpedia in SQL”

The timbr SQL Semantic Knowledge Platform enables the creation of virtual knowledge graphs in SQL. The DBpedia version of timbr supports query of DBpedia in SQL and seamless integration of DBpedia data into data warehouses and data lakes. We already published a detailed blogpost about timbr where you can find all relevant information about this amazing new DBpedia Service.

Showcase Session

Maribel Acosta – “A closer look at the changing dynamics of DBpedia mappings”

Her presentation looked at the mappings wiki and how different language chapters use and edit it. Slides are available here.

Mariano Rico – “Polishing a diamond: techniques and results to enhance the quality of DBpedia data”

DBpedia is more than a source for creating papers. It is also being used by companies as a remarkable data source. This talk focuses on how we can detect errors and how to improve the data, from the perspective of both academic researchers and private companies. We show the case for the Spanish DBpedia (the second DBpedia in size after the English chapter) through a set of techniques, paying attention to results and further work. Slides are available here.

Guillermo Vega-Gorgojo – “Clover Quiz: exploiting DBpedia to create a mobile trivia game”

Clover Quiz is a turn-based multiplayer trivia game for Android devices with more than 200K multiple choice questions (in English and Spanish) about different domains generated out of DBpedia. Questions are created off-line through a data extraction pipeline and a versatile template-based mechanism. A back-end server manages the question set and the associated images, while a mobile app has been developed and released in Google Play. The game is available free of charge and has been downloaded by +10K users, answering more than 1M questions. Therefore, Clover Quiz demonstrates the advantages of semantic technologies for collecting data and automating the generation of multiple-choice questions in a scalable way. Slides are available here.

Fabian Hoppe and Tabea Tiez – “The Return of German DBpedia”

Fabian and Tabea will present the latest news on the German DBpedia chapter as it returns to the language chapter family after an extended offline period. They will talk about the data set, discuss a few challenges along the way and give insights into future perspectives of the German chapter. Slides are available here.

Wlodzimierz Lewoniewski and Krzysztof Węcel  – “References extraction from Wikipedia infoboxes”

In Wikipedia’s infoboxes, some facts have references, which can be useful for checking the reliability of the provided data. We present challenges and methods connected with the metadata extraction of Wikipedia’s sources. We used the DBpedia Extraction Framework along with our own extensions in Python to provide statistics about citations in 10 language versions. The provided methods can be used to verify and synchronize facts depending on the quality assessment of sources. Slides are available here.

Wlodzimierz Lewoniewski also gave insight into the process of extracting references for Wikipedia infoboxes, which we will use in our GFS project.

Afternoon Session

Sebastian Hellmann, Johannes Frey, Marvin Hofer – “The DBpedia Databus – How to build a DBpedia for each of your Use Cases”

The DBpedia Databus is a platform that is intended for data consumers. It will enable users to build an automated DBpedia-style Knowledge Graph for any data they need. The big benefit is that users not only have access to data, but are also encouraged to apply improvements and, therefore, will enhance the data source and benefit other consumers. We want to use this session to officially introduce the Databus, which is currently in beta and demonstrate its power as a central platform that captures decentrally created client-side value by consumers.  

We will give insight on how the new monthly DBpedia releases are built and validated to copy and adapt for your use cases. Slides are available here.

Interactive session, moderator: Sebastian Hellmann – “DBpedia Connect & DBpedia Commerce – Discussing the new Strategy of DBpedia”

In order to keep growing and improving, DBpedia has been undergoing a growth hack for the last couple of months. As part of this process, we developed two new subdivisions of DBpedia: DBpedia Connect and DBpedia Commerce. The former is a low-code platform to interconnect your public or private databus data with the unified, global DBpedia graph and export the interconnected and enriched knowledge graph into your infrastructure. DBpedia Commerce is an access and payment platform to transform Linked Data into a networked data economy. It will allow DBpedia to offer any data, mod, application or service on the market. During this session, we will provide more insight into these as well as an overview of how DBpedia users can best utilize them. Slides are available here.

In case you missed the event, all slides and presentations are also available on our Website. Further insights, feedback and photos about the event are available on Twitter via #DBpediaDay

We are now looking forward to more DBpedia meetings next year. So, stay tuned and check Twitter, Facebook and the Website or subscribe to our Newsletter for the latest news and information.

If you want to organize a DBpedia Community meeting yourself, just get in touch with us via dbpedia@infai.org regarding program and organization.

Yours

DBpedia Association

timbr – the DBpedia SQL Semantic Knowledge Platform https://www.dbpedia.org/blog/timbr-the-dbpedia-sql-semantic-knowledge-platform/ Thu, 18 Jul 2019 09:39:33 +0000

The post timbr – the DBpedia SQL Semantic Knowledge Platform appeared first on DBpedia Association.

With timbr, WPSemantix and the DBpedia Association launch the first SQL Semantic Knowledge Graph that integrates Wikipedia and Wikidata Knowledge into SQL engines.

In part three of DBpedia’s growth hack blog series, we feature timbr, the latest development at DBpedia in collaboration with WPSemantix. Read on to find out how it works.

timbr – DBpedia SQL Semantic Knowledge Platform

Tel Aviv, Israel and Leipzig, Germany – July 18, 2019 – WP-Semantix (WPS), the “SQL Knowledge Graph Company”™, and DBpedia Association – Institut für Angewandte Informatik e.V. announced today the launch of the timbr-DBpedia SQL Semantic Knowledge Platform, a unique version of WPS’ timbr SQL Semantic Knowledge Graph that integrates the timbr-DBpedia ontology, timbr’s ontology explorer/visualizer and timbr’s SQL query service, to provide for the first time semantic access to DBpedia knowledge in SQL and thus to facilitate DBpedia knowledge integration into standard data warehouses and data lakes.

DBpedia

DBpedia is the crowd-sourced community effort to extract structured content from the information created in various Wikimedia projects and publish it as files on the Databus and via online databases. This structured information resembles an open knowledge graph which has been available for everyone on the Web for over a decade. Knowledge graphs are a new kind of database developed to store knowledge in a machine-readable form, organized as connected, relationship-rich data. After the publication of DBpedia (in parallel to Freebase) 12 years ago, knowledge graphs have become very successful and Google uses a similar approach to create the knowledge cards displayed in search results.

Query the world’s knowledge in standard SQL

Amit Weitzner, founder and CEO at WPS commented: “Knowledge graphs use specialized languages, require resource-intensive, dedicated infrastructure and require costly ETL operations. That is, they did until timbr came along. timbr employs SQL – the most widely known database language, to eliminate the technological barriers to entry for using knowledge graphs and to implement Semantic Web principles to provide knowledge graph functionality in SQL. timbr enables modelling of data as connected, context-enriched concepts with inference and graph traversal capabilities while being queryable in standard SQL, to represent knowledge in data warehouses and data lakes. timbr-DBpedia is our first vertical application and we are very excited by the prospects of our cooperation with the DBpedia team to enable the largest user base to query the world’s knowledge in standard SQL.”

Sebastian Hellmann, executive director of the DBpedia Association commented that:

“our vision of the DBpedia Databus – transforming Linked Data into a networked data economy, is becoming a reality thanks to tools such as timbr-DBpedia which take full advantage of our unique data sets and data architecture. We look forward to working with WPS to also enable access to new data sets as they become available.”

timbr will help to explore the power of semantic technologies

Prof. James Hendler, pioneer and a world-leading authority in Semantic Web technologies and WPS’ advisory board member, commented: “timbr can be a game-changing solution by enabling the semantic inference capabilities needed in many modelling applications to be done in SQL. This approach will enable many users to get the advantages of semantic AI technologies and data integration without the learning curve of many current systems. By giving more people access to the semantic version of Wikipedia, timbr-DBpedia will definitely contribute to allowing the majority of the market to explore the power of semantic technologies.”

timbr-DBpedia is available as a query service or licensed for use as SaaS or on-premises. See the DBpedia website: wiki.dbpedia.org/timbr.

About WPSemantix

WP-Semantix Ltd. (wpsemantix.com) is the developer of the timbr SQL semantic knowledge platform, a dynamic abstraction layer over relational and non-relational data, facilitating declaration and powerful exploration of semantically rich ontologies using a standard SQL query interface. timbr is natively accessible in Apache Spark, Python, R and SQL to empower data scientists to perform complex analytics and generate sophisticated ML algorithms.  Its JDBC interface provides seamless integration with the most popular business intelligence solutions to make complex analytics accessible to analysts and domain experts across the organization.

WP-Semantix, timbr, “SQL Knowledge Graph”, “SQL Semantic Knowledge Graph” and associated marks and trademarks are registered trademarks of WP Semantix Ltd.

DBpedia is looking forward to this cooperation. Follow us on Twitter for the latest information and stay tuned for part four of our growth hack series. The next post features the GlobalFactSyncRe. Curious? You have to be a little more patient and wait till Thursday, July 25th.

Yours DBpedia Association

Home Sweet Home – The 13th DBpedia Community Meeting https://www.dbpedia.org/blog/home-sweet-home-the-13th-dbpedia-community-meeting/ Tue, 18 Jun 2019 12:52:59 +0000

The post Home Sweet Home – The 13th DBpedia Community Meeting appeared first on DBpedia Association.

For the second time now, we co-located one of our DBpedia community meetings with the LDK conference. After the previous edition in Galway two years ago, it was Leipzig’s turn to host the event. Thus, the 13th DBpedia community meeting took place in this beautiful city, which is also home to the DBpedia Association’s head office. A win-win, we’d say.

After a very successful LDK conference May 20th-21st, representatives of the European DBpedia community met at Villa Ida Mediencampus on Thursday, May 23rd, to present their work with DBpedia and to exchange ideas about the DBpedia Databus.

For those of you who missed it or for those who want a little retrospective on the day, this blog post provides you with a short LDK-wrap-up as well as a recap of our DBpedia Day.

First things first

First and foremost, we would like to thank the LDK organizers for co-locating our meeting and thus enabling fruitful synergies and providing a platform for the DBpedia community to exchange ideas.

LDK

The first presentation that kicked off the conference was given by Prof. Christiane Fellbaum from Princeton University. Her talk, “Mapping the Lexicons of Signs and Words”, focused on her research on mapping WordNet and SignStudy, a resource for American Sign Language. Shortly after, Prof. Eduard Werner from Leipzig University gave a very exciting talk on the “Sorbian languages”. He discussed the nature of the Sorbian languages, their historical background, and the unfortunate imminent extinction of Lower Sorbian due to a decline in native speakers.

The first day of LDK was full of exciting presentations related to various language-oriented topics. Researchers exchanged ideas about linguistic vocabularies, SPARQL query recommendations, role and reference grammar, language detection, entity recognition, machine translation, under-resourced languages, metaphor identification, event detection and linked data in general. The first day ended with fruitful discussions during the poster session. At the end of the first conference day, LDK visitors had the chance to mingle with locals in some of Leipzig’s most exciting bars during a pub crawl.

Prof. Christian Bizer from the University of Mannheim opened the second day with a keynote on “Schema.org Annotations and Web Tables: Underexploited Semantic Nuggets on the Web?”. In his talk, he gave a nice overview of the research on knowledge extraction around the large-scale Web Data Commons corpus, findings, open challenges and possible exploitations of this corpus.

The second day was busy with four sessions, each populated with presentations on exciting topics ranging from relation classification, dictionary linking and entity linking, to terminology models, topical thesauri and morphology.

The series of presentations ended with an Organ Prelude played by David Timm, the University Music Director at Leipzig University. Finally, the day and the conference were concluded with a conference dinner at Moritzbastei, one of Leipzig’s famous cultural centres.

DBpedia Day

On May 23rd, the DBpedia Community met for the 13th DBpedia community meeting. The event attracted more than 60 participants who extended their LDK experience or followed our call to Leipzig.

Opening & keynotes

The meeting was opened by Dr. Sebastian Hellmann, the executive director of the DBpedia Association. He gave an overview of the latest developments and achievements around DBpedia, with the main focus on the DBpedia Databus technologies. The first keynote was given by Dr. Peter Haase, from metaphacts, with an unusual interactive presentation on “Linked Data Fun with DBpedia”. The second keynote speaker was Prof. Heiko Paulheim, presenting findings, challenges and results from his work on the construction of the DBkWik Knowledge Graph by exploiting the DBpedia extraction framework.

Showcase session

The showcase session started with a presentation given by Krzysztof Węcel on “Citations and references in DBpedia”, followed by Peter Nancke with a presentation on the “TeBaQA Question Answering System”, Maribel Acosta Deibe speaking about “Crowdsourcing the Quality of DBpedia” and finally, a presentation by Angus Addlesee on “Data Reconciliation using DBpedia”.

NLP & DBpedia session

The NLP & DBpedia session was opened by Diego Moussallem presenting the results from his work on “Generating Natural Language from RDF Data”. The second presentation was given by Christian Jilek on the topic of “Named Entity Recognition for Real-Time Applications”, which also won the best research paper award at the LDK conference. Next, Jonathan Kobbe presented the best student paper at the LDK conference on the topic of “Argumentative Relation Classification”. Finally, Edgard Marx closed the session with an overview presentation on “From the word to the resource”.

 

Side-Event – Hackathon

The “Artificial Intelligence for Smart Agriculture” Hackathon focused on enhancing the usability of automatic analysis tools which utilize semantic big data for agriculture, as well as conducting an outreach of the DataBio project for the DBpedia community. The event was supported by PNO, Spacebel, PSNC, and InfAI e.V.

We improved the visualization module of Albatross, a platform for processing and analyzing Linked Open Data, and added functionalities to geo-L, the geospatial link discovery tool.  

In addition, we presented a paper about Linked Data publication pipelines, focusing on agri-related data, at the co-located LSWT conference.

Wrap Up

After the event, DBpedians joined the DBpedia Association in the nearby pub Gosenschenke to continue lively discussions about the Semantic Web world, Linked Data & DBpedia.

In case you missed the event, all slides and presentations are available on our website. Further insights, feedback, and photos about the event can be found on Twitter via #DBpediaLeipzig.

We are currently looking forward to the next DBpedia Community Meeting on September 12th in Karlsruhe, Germany. This meeting is co-located with the SEMANTiCS Conference. Contributions are still welcome. Just ping us via dbpedia@infai.org and show us what you’ve got. You should also get in touch with us if you want to host a DBpedia Meetup yourself. We will help you with the program, dissemination, or organizational matters of the event if need be.

Stay tuned, check Twitter, Facebook, and the website, or subscribe to our newsletter for the latest news and updates.

Your DBpedia Association

The post Home Sweet Home – The 13th DBpedia Community Meeting appeared first on DBpedia Association.

]]>
Vítejte v Praze! https://www.dbpedia.org/blog/community-meetup-prague-2019/ Wed, 13 Feb 2019 12:18:06 +0000 https://blog.dbpedia.org/?p=1103 After our meetups in Poland and France last year, we delighted the Czech DBpedia community with a DBpedia meetup. It was co-located with the XML Prague conference on February 7th, 2019. First and foremost, we would like to thank Jirka Kosek (University of Economics, Prague), Milan Dojchinovski (AKSW/KILT, Czech Technical University in Prague), Tomáš Kliegr […]

The post Vítejte v Praze! appeared first on DBpedia Association.

]]>
After our meetups in Poland and France last year, we delighted the Czech DBpedia community with a DBpedia meetup. It was co-located with the XML Prague conference on February 7th, 2019.

First and foremost, we would like to thank Jirka Kosek (University of Economics, Prague), Milan Dojchinovski (AKSW/KILT, Czech Technical University in Prague), Tomáš Kliegr (KIZI/University of Economics, Prague) and the XML Prague conference for co-hosting and supporting the event.

Opening the DBpedia community meetup

The Czech DBpedia community and the DBpedia Databus were the focus of this meetup. Therefore, we invited local data scientists as well as DBpedia enthusiasts to discuss the state of the art of the DBpedia Databus. Sebastian Hellmann (AKSW/KILT) opened the meeting with an introduction to DBpedia and the DBpedia Databus. Next, Marvin Hofer explained how to use the DBpedia Databus in combination with Docker, and Johannes Frey (AKSW/KILT) presented the methods behind DBpedia’s Data Fusion and Global ID Management.

Showcase Session

Marek Dudáš (KIZI/UEP) started the DBpedia Showcase Session with a presentation on “Concept Maps with the help of DBpedia”, where he showed the audience how to create a “concept map” with the ContextMinds application. Furthermore, Tomáš Kliegr (KIZI/UEP) presented “Explainable Machine Learning and Knowledge Graphs”. He explained his contribution to a rule-based classifier for business use cases. Two other showcases followed: Václav Zeman (KIZI/UEP), who presented “RdfRules: Rule Mining from DBpedia” and Denis Streitmatter (AKSW/KILT), who demonstrated the “DBpedia API”.

Closing the session, Miroslav Blasko (CTU, Prague) gave a presentation on “Ontology-based Dataset Exploration”. He explained a taxonomy developed for dataset description. Additionally, he presented several use cases whose main goal is to improve content-based descriptors.

Summing up, the DBpedia meetup in Prague brought together more than 50 DBpedia enthusiasts from all over Europe. They engaged in lively discussions about Linked Data, the DBpedia Databus, as well as DBpedia use cases and services.

In case you missed the event, all slides and presentations are available on our website. Further insights, feedback, and photos about the event can be found on Twitter via #DBpediaPrague.

We are currently looking forward to the next DBpedia Community Meeting, on May 23rd, 2019 in Leipzig, Germany. This meeting is co-located with the Language, Data and Knowledge (LDK) conference. Stay tuned and check Twitter, Facebook and the website or subscribe to our newsletter for the latest news and updates.

Your DBpedia Association

]]>