DBpedia Association (https://www.dbpedia.org/)
Global and Unified Access to Knowledge Graphs

GSoC 2024 – Call for Contributors
https://www.dbpedia.org/blog/gsoc-2024-call-for-contributors/
Fri, 01 Mar 2024 09:21:00 +0000

Are you a student looking for a summer experience that combines coding skills with open source development? Then look no further than the Google Summer of Code program 2024, where you can join forces with DBpedia to help advance the state of the art in semantic web technologies. Build your skills and gain valuable experience while making a real impact on the tech community!

We have been accepted to be part of this incredible program to support young ambitious developers who want to work with open-source organizations like DBpedia. So far, each year has brought us new project ideas, many amazing students and great project results that shaped the future of DBpedia. Even though Covid-19 changed a lot in the world, it couldn’t shake Google Summer of Code (GSoC) much. The program, designed to mentor youngsters from afar, is almost too perfect for us. One of the advantages of GSoC, especially in times like these, is the chance to work on projects remotely while still getting a first deep dive into open source projects like ours.

DBpedia is now looking for contributors who want to work with us during the upcoming summer months.  

What is Google Summer of Code?

Google Summer of Code is a global program focused on bringing developers into open source software development. Funds are given to new and beginner open source contributors over 18 years of age to work for two and a half months (or longer) on a specific task. For GSoC newbies, this short video and the information provided on the GSoC website explain all there is to know about GSoC2024.

And this is how it works …

Step 1: Check out one of our projects here or draft your own.
Step 2: Get in touch with our mentors as soon as possible and write up a project proposal of at least 8 pages. Information about our proposal structure and a template are available here.
Step 3: After a selection phase, contributors are matched with a specific project and mentor(s) and start working on the project.

Application Procedure GSoC2024

Further information on the application procedure is available in our DBpedia Guidelines. There you will find information on how to contact us and how to appropriately apply for GSoC2024. Please also note the official GSoC 2024 timeline for your proposal submission and make sure to submit on time. Unfortunately, extensions cannot be granted. Final submission deadline is April 2, 2024 at 18:00 UTC.

Contact

Detailed information on how to apply is available on the DBpedia website. We’ve prepared an information kit for you. Please find all necessary information regarding the student application procedure here.

And in case you still have questions, please do not hesitate to contact us via dbpedia@infai.org.

Stay safe and check Twitter or LinkedIn. Furthermore, you can subscribe to our Newsletter for the latest news and information around DBpedia.

Finally, we are looking forward to your contribution!

Yours DBpedia Association

A year with DBpedia – Retrospective Part 2/2023
https://www.dbpedia.org/blog/a-year-with-dbpedia-retrospective-part-2-2023/
Thu, 04 Jan 2024 13:45:24 +0000

This is the final part of our journey through 2023. In the previous blog post we presented the DBpedia highlights. Now we will take a look at the second half of 2023 and give an outlook for 2024.

Tutorial @  Language, Data and Knowledge conference

On 13 September 2023, an exciting tutorial took place at the Center for Translation Studies of the University of Vienna as part of LDK 2023. The LDK conference focuses on the acquisition, maintenance and use of language data in the context of data science and knowledge-based applications. The tutorial was opened by Milan Dojchinovski (InfAI, DBpedia Association, CTU in Prague). This was followed by three sessions on the DBpedia Knowledge Graph, the infrastructure and the use of the Databus data publishing platform, accompanied by many real-world practical use cases. Check more details on our events page.

DBpedia Day @ SEMANTiCS in Leipzig 

DBpedia Day was once again part of the program at this year’s SEMANTiCS conference. It was held on 20 September 2023 at the HYPERION Hotel Leipzig with up to 100 DBpedians. Our CEO Sebastian Hellmann opened the day with a presentation of the “DBpedia Databus version 2.1.0”. This was followed by the exciting keynote speech “Towards Foundation Models for Data Spaces” by Edward Curry from the University of Galway, Ireland. Afterwards, we organized the member session and the DBpedia Science Talk session. All slides can also be found on our events page.

Databus

Databus pre-launch announcement

We are in the final stage of the DBpedia Databus open software release (GitHub). Remaining issues include quality of life and UI improvements. Check out the Databus feature matrix for our lightweight, scalable, adaptable, powerful Data Catalog Platform (direct download link, persistent data identifier on the databus). Contact dbpedia@infai.org for demo, business, or research proposal inquiries.

Databus excels at cataloging de-central data of any filetype using RDF/DCAT. We selected a few initial focal use cases, where the Databus serves as:

  1. AIModelHub for AI training data, models, validation, and deployment.
  2. Research Data Management Catalog for research institutes and communities.
  3. Supply-Chain-Management Platform for product information collection along the supply chain and construction of Digital Product Passports.
  4. Community Data Portal, e.g., for the DBpedia Community.

DBpedia Contributions will be enabled soon, taking DBpedia to the moon! 🚀

In DBpedia’s future, the Databus will be used to collect community contributions more effectively, giving DBpedia an enormous boost in quantity and quality. https://databus.dbpedia.org already catalogs over 350k files with over 1 million file downloads per month! We are preparing showcases, templates, and documentation for these community contribution types:

  1. Community Extensions such as caligraph.org or AI-improved abstracts.
  2. Community Link Contributions for inclusion in the main graph.
  3. RDF profiles for DBpedia Users and Members (FOAF, Schema.org, WebID) via Databus Accounts (including publication of expertise).
  4. Dockerized RDF Tool Deployment so you can automatically load DBpedia and other RDF data into your favorite RDF tools via Databus collections. Our Databus-powered Virtuoso SPARQL Endpoint Quickstart Docker has already been deployed over 150k times! 

We do hope we will meet you and some new faces during our events next year. The association wants to get to know you because DBpedia is a community effort and would not continue to develop, improve and grow without you. We plan to have a tutorial at the LREC-COLING 2024 conference and a meeting at the SEMANTiCS conference, Sep 17-19, 2024, in Amsterdam, Netherlands.

Stay safe and check Twitter, Instagram and LinkedIn or subscribe to our Newsletter for the latest news and information.

Yours,

Julia & Maria

on behalf of the DBpedia Association

Recap 2023: A Year with DBpedia
https://www.dbpedia.org/blog/recap-2023-a-year-with-dbpedia/
Mon, 04 Dec 2023 11:50:24 +0000

Can you believe it..? … sixteen years ago the first DBpedia dataset was released. Sixteen years of development, improvements and growth. Now more than 4,100 GByte of data is hosted on the DBpedia Databus. We want to take this as an opportunity to send out a big “Thank you!” to all contributors, developers, members, hosters, funders, believers and DBpedia enthusiasts who made that possible. Thank you for your support!

In the upcoming blog series, we will take you on a retrospective tour through 2023. Furthermore, we will give you insights into a year with DBpedia. In the following we will also highlight our past events. 

Snapshot Release

We are pleased to announce immediate availability of a new edition of the free and publicly accessible SPARQL Query Service Endpoint and Linked Data Pages, for interacting with the new Snapshot Dataset. In 2023, we released version 2022-12, which includes all features added since version 2022-09. The current Snapshot Release contains more than 850 million facts (triples). Please check more details on our website.

Google Summer of Code (GSoC)

For the 12th year in a row, we have been able to support and guide young, ambitious developers who have joined us as an open source organization. We encouraged them to work on a programming project this summer. Each year we have been inspired by new project ideas, many amazing contributors, and mostly great project results that have shaped the future of DBpedia. If you want deeper insight into our GSoC contributors’ work, you can find their blogs and repos on the DBpedia blog.

DBpedia @ Leipzig Semantic Web Day

On June 28, 2023, Sebastian Hellmann presented the DBpedia Databus 2.1 at Data Week Leipzig. Data Week is the networking and exchange event for highlighting scientific, economic, and social perspectives of data and its use, where industry, citizens, science, and public authorities can enter into dialogue. Data Week Leipzig took place June 26-30, 2023. Please find Sebastian’s slides here.

In the upcoming blog post after the holidays we will give you more insight into the past events and technical achievements. We are now looking forward to the year 2024. The DBpedia team plans to have a tutorial at the LREC-COLING 2024 conference and the DBpedia Day at the SEMANTiCS 2024 conference in Amsterdam, Netherlands.

Above all, we wish you a merry Christmas and a happy new year. In the meantime, stay tuned and check our Twitter, Instagram or LinkedIn channels. You can subscribe to our Newsletter for the latest news and information around DBpedia.

Julia & Maria,   

on behalf of the DBpedia Association

Retrospective: Google Summer of Code 2023
https://www.dbpedia.org/blog/retrospective-google-summer-of-code-2023/
Wed, 01 Nov 2023 09:02:40 +0000

We received 27 project proposals for this GSoC edition.

For the 12th year in a row, we have been able to support and guide young, ambitious developers who joined us as an open source organization to work on a programming project over this summer. This year we were once again inspired by new project ideas, great results and dedicated students. From the numerous and wonderful project applications we received, we were able to select six proposals to take part in the GSoC with their project idea. Every year, Google Summer of Code offers a great opportunity to work on projects remotely while getting a deep insight into open source projects like ours – DBpedia.

Meet our Google Summer of Code 2023 contributors and their projects

During our summer program, our six finalists worked intensely on their DBpedia projects and achieved great results to show to the public. Topics in the projects included machine learning and natural language processing, extraction frameworks and chatbot development. If you want deeper insight into our GSoC contributors’ work, you can find their blogs and repos in the following list. Check them out!

Thanks to mentors

Thanks to all our mentors around the world for joining our project and supporting us with their expertise and kindness. Above all, a big thank you to those who have supported us for many years in a row. Thank you all again for spending more than 3.5 months working with this year’s GSoC contributors and helping them become better open source contributors!

Mentor Summit

The last GSoC Mentor Summit took place online and was therefore open to all organisation admins and mentors. This year, one mentor was again selected to attend: @DiegoMoussallem took part in the Mentor Summit 2023, which was held from Friday 13 to Sunday 15 October at the TETRA & Marriott Hotel in Sunnyvale, California.

After GSoC is before the next GSoC

We cannot wait for the 2024 edition. If you are an ambitious contributor who is interested in open source development and working with DBpedia, you are more than welcome to either contribute your own project idea or apply for one of the project ideas we will offer starting in early 2024. If you would like to know where previous mentors and contributors are now working, please read our GSoC blog post about the last 10 years of DBpedia at GSoC.

If you would like to mentor a project, do not hesitate to get in touch with us via dbpedia@infai.org.

Stay safe and check Twitter or LinkedIn. Furthermore, you can subscribe to our Newsletter for the latest news and information around DBpedia.

Maria & Julia

on behalf of the DBpedia Association

DBpedia Day in Leipzig @ SEMANTiCS 2023
https://www.dbpedia.org/blog/dbpedia-day-in-leipzig-semantics-2023/
Tue, 10 Oct 2023 09:09:29 +0000

Wow! Up to 120 DBpedians joined the DBpedia Day on September 20, 2023, in Leipzig, Germany. This year’s meeting was again co-located with the SEMANTiCS conference. 

First and foremost, we would like to thank the Institute for Applied Informatics for supporting our community and many thanks to the SEMANTiCS organization team for hosting this year’s community meeting. 

Opening of the DBpedia Day

Also this year, our CEO Sebastian Hellmann opened the community meeting by presenting the Databus 2.1.0 project (slides). Afterwards, Edward Curry from the University of Galway gave his fantastic keynote presentation “Towards Foundation Models for Data Spaces”. You can read his abstract here.

Member Presentation Session

Milan Dojchinovski, InfAI/DBpedia Association and CTU Prague, started the member presentation session with a short welcome. The first speaker was Angel Moreno, GNOSS, with his presentation “NEURALIA Rioja: the unified Knowledge Graph of La Rioja Government which integrates twenty six sources of information in a single access point” (slides). Shortly after, Enno Meijers, KB, talked about “Network-of-Terms, bringing links to your data” (slides). Next, Sarah Binta Alam Shoilee, Network Institute & Vrije Universiteit Amsterdam, talked about “Cultural AI Lab” (slides). This was followed by the presentation “Linking and Consumption of DBpedia in TriplyDB” by Kathrin Dentler & Wouter Beek, TriplyDB (slides). Then Sebastian Gabler, SWC, talked about “Using Dewey Decimal Classification for linked data” (slides). Finally, the last talk of this session was given by Sebastian Tramp, eccenca, with “Using DBpedia Services with eccenca Corporate Memory and eccenca.my”.

For further details of the presentations follow the links to the slides. 

  • “NEURALIA Rioja: the unified Knowledge Graph of La Rioja Government which integrates twenty six sources of information in a single access point” by Angel Moreno, GNOSS (slides)
  • “Network-of-Terms, bringing links to your data” by Enno Meijers, KB (slides)
  • “Cultural AI Lab” by Sarah Binta Alam Shoilee, Network Institute & Vrije Universiteit Amsterdam (slides)
  • “Linking and Consumption of DBpedia in TriplyDB” by Kathrin Dentler & Wouter Beek, TriplyDB (slides)
  • “Using Dewey Decimal Classification for linked data” by Sebastian Gabler, SWC (slides)
  • “Using DBpedia Services with eccenca Corporate Memory and eccenca.my” by Sebastian Tramp, eccenca (slides)

DBpedia Science: Linking and Consumption

This session was dedicated to the most recent research on linking and consumption of the DBpedia Knowledge Graph and beyond. Novel methods, tools and challenges around linking and consumption of knowledge graphs were presented and discussed. Milan Dojchinovski, InfAI/DBpedia Association and CTU Prague, chaired this session with five talks. Hereafter you will find the presentations given during this session:

  • “Open Research Knowledge Graph” by Sören Auer, TIB
  • “Blocking Methods for Entity Resolution on Knowledge Graphs” by Daniel Obraczka, Data Science Center ScaDS.AI Dresden/Leipzig (slides)
  • “Validating SHACL Constraints with Reasoning: Lessons Learned from DBpedia” by Maribel Acosta, TUM School of Computation, Information and Technology
  • “Exploiting Semi-Structured Information in Wikipedia for Knowledge Graph Construction” by Nicolas Heist, Data and Web Science Group, University of Mannheim (slides)
  • “Using Pre-trained Language Models for Abstractive DBpedia Summarization” by Hamada Zahera, Data Science Group, Paderborn University (slides)

DBpedia Community session

Sebastian Hellmann, InfAI/DBpedia Association, hosted this year’s community session. DBpedia has had a major impact on the data landscape during our 15-year journey. This session discussed the progress of the vision of a “Global and Unified Access to Knowledge Graphs”, which paved the way for an international FAIR Open Data Space driven by knowledge graphs. The session focused on the potential of large-scale knowledge graphs to reshape the open data domain. Topics included how the DBpedia community can pool its data, tools and know-how more effectively, and how we can make these assets more findable, accessible and interoperable. The session provided an insightful discourse on the future of open data and how we can forge strategic alliances across diverse industrial sectors.

The presentations of this session were:

  • “Update Japanese DBpedia” by Hideaki Takeda, LODI (slides)
  • Several impulses about different topics and follow-up discussion, moderated by Sebastian Hellmann, InfAI/DBpedia Association (discussion document)

In case you missed the event, all slides are also available on our event page. Further insights, feedback and photos about the event are available on Twitter via #DBpediaDay.

We are now looking forward to more DBpedia events in the upcoming months and at next year’s SEMANTiCS Conference, which will be held in Amsterdam, Netherlands.  

Stay safe and check Twitter or LinkedIn. Furthermore, you can subscribe to our Newsletter for the latest news and information around DBpedia.

Maria & Julia

on behalf of the DBpedia Association

Retrospective 2023 – Half a year with DBpedia
https://www.dbpedia.org/blog/retrospective-2023-half-a-year-with-dbpedia/
Tue, 04 Jul 2023 11:10:25 +0000

Already, half of the year 2023 has passed by. Time for us to look back on the past half year. What have we achieved? What still lies ahead of us? In the following, we will take you on a retrospective tour through the first half of 2023. We will highlight our past events and the development around the DBpedia dataset. Have fun reading!

DBpedia is part of the Google Summer of Code project 2023

So far, each year has brought us new project ideas, many amazing students and great project results that shaped the future of DBpedia. Like every year, we received many fantastic applications this year. Out of these applications 6 great projects from contributors all over the world were selected to work together with our mentors. Right now the contributors are in the middle of the coding phase. If you want to know more about this year’s projects go and have a look at the DBpedia blog.

DBpedia Snapshot 2022-12 Release

We are pleased to announce immediate availability of a new edition of the free and publicly accessible SPARQL Query Service Endpoint and Linked Data pages, for interacting with the new Snapshot Dataset. Check our blog!

Leipzig Semantic Web Day

On June 28, 2023, Sebastian Hellmann presented the DBpedia Databus 2.1 at Data Week Leipzig. Data Week is the networking and exchange event for highlighting scientific, economic, and social perspectives of data and its use, where industry, citizens, science, and public authorities can enter into dialogue. Data Week Leipzig took place June 26-30, 2023. Please find Sebastian’s slides here.

What Will the Future Bring?

We are now looking forward to the LDK conference, which will take place September 12-15, 2023, in Vienna, Austria. We will organize a tutorial on September 13, 2023. If you would like to join, please check more details on our event page. After that, we’ll fly straight back to Leipzig, because the SEMANTiCS conference will be held at the Hyperion Hotel Leipzig from September 20 to 22, 2023. At the beginning of the conference, we will host the DBpedia Day on September 20, 2023.

Stay safe and check Twitter or LinkedIn. Furthermore, you can subscribe to our newsletter for the latest news and information around DBpedia.

Julia

on behalf of the DBpedia Association

DBpedia – GSoC Bonding Period 2023
https://www.dbpedia.org/blog/dbpedia-gsoc-bondning-period-2023/
Thu, 04 May 2023 07:28:59 +0000

Great job! You have been chosen as one of our GSoC students for the summer of 2023, where you will be working together with DBpedia. We would like to introduce you to the DBpedia community, developers, and your mentors so that you can establish contact with them. Keep reading to find out more!

Student Projects Announced

Today Google finally announced who has been selected as a GSoC student this year. Accepted students are now paired with a mentor and can start planning their projects and milestones.

GSoC Community Bonding

Community Bonding runs from May 4 until May 28, so now is the time to spend a month learning more about DBpedia and its community before coding starts on May 29. To get in touch with your mentors and everyone else from the DBpedia Community, you have plenty of options:

  • First of all, you can chat with other DBpedians on Slack, where you can join the DBpedia developer discussions and other technical discussions.
  • To increase your visibility in the DBpedia Community, try to answer some questions in the DBpedia forum (especially in the unanswered & support category) and browse the topics. 
  • Last but not least, check out our GitHub repository for open issues and see if you can help to solve them (e.g., issues regarding the extraction framework or mappings).

When you share something about your project on your own blog or GitHub, please inform us and your mentors. That way, we can share it with the community and show your progress as well as your results.

In case you still have questions, please do not hesitate to contact us via dbpedia@infai.org.

Check Twitter or LinkedIn and feel free to subscribe to our newsletter for the latest news and information around DBpedia.

We wish you all the best!

Emma

on behalf of the DBpedia Association

DBpedia Snapshot 2022-12 Release
https://www.dbpedia.org/blog/dbpedia-snapshot-2022-12-release/
Mon, 27 Mar 2023 09:36:32 +0000

We are pleased to announce immediate availability of a new edition of the free and publicly accessible SPARQL Query Service Endpoint and Linked Data Pages, for interacting with the new Snapshot Dataset. 

News since DBpedia Snapshot 2022-09

  • New Abstract Extractor due to GSOC 2022 (credits to Celian Ringwald) 

Work in progress: Smoothing the community issue reporting and fixing on GitHub

What is the “DBpedia Snapshot” Release?

Historically, this release has been associated with many names: “DBpedia Core”, “EN DBpedia”, and, most confusingly, just “DBpedia”. In fact, it is a combination of:

  • EN Wikipedia data: A small, but very useful, subset (~ 1 Billion triples or 14%) of the whole DBpedia extraction using the DBpedia Information Extraction Framework (DIEF), comprising structured information extracted from the English Wikipedia plus some enrichments from other Wikipedia language editions, notably multilingual abstracts in ar, ca, cs, de, el, eo, es, eu, fr, ga, id, it, ja, ko, nl, pl, pt, sv, uk, ru, zh.
  • Links: 62 million community-contributed cross-references and owl:sameAs links to other linked data sets on the Linked Open Data (LOD) Cloud. They make it possible to effectively find and retrieve further information from the largest decentralized, change-sensitive knowledge graph on earth, which has formed around DBpedia since 2007.
  • Community extensions: Community-contributed extensions such as additional ontologies and taxonomies.

Release Frequency & Schedule

Going forward, releases will be scheduled for the 15th of February, May, August, and November (with +/- 5 days tolerance), and are named using the same date convention as the Wikipedia Dumps that served as the basis for the release. An example of the release timeline is shown below:

  • December 6–8: Wikipedia dumps for June 1 become available on https://dumps.wikimedia.org/
  • December 8–20: Download and extraction with DIEF
  • Dec 20–Jan 1: Post-processing and quality-control period
  • Jan 1–Feb 15: Linked Data and SPARQL endpoint deployment

Data Freshness

Given the timeline above, the EN Wikipedia data of DBpedia Snapshot has a lag of 1-4 months. We recommend the following strategies to mitigate this:

  1. DBpedia Snapshot as a kernel for Linked Data: Following the Linked Data paradigm, we recommend using the Linked Data links to other knowledge graphs to retrieve high-quality and recent information. DBpedia’s network consists of the best knowledge engineers in the world, working together, using linked data principles to build a high-quality, open, decentralized knowledge graph network around DBpedia. Freshness and change-sensitivity are two of the greatest data-related challenges of our time, and can only be overcome by linking data across data sources. The “Big Data” approach of copying data into a central warehouse is inevitably challenged by issues such as co-evolution and scalability. 
  2. DBpedia Live: Wikipedia is unmistakably the richest, most recent body of human knowledge and source of news in the world. DBpedia Live is just minutes behind edits on Wikipedia, which means that as soon as any of the 120k Wikipedia editors presses the “save” button, DBpedia Live will extract fresh data and update. DBpedia Live consists of the DBpedia Live Sync API (for syncing into any kind of on-site databases), Linked Data and SPARQL endpoint.
  3. Latest-Core is a dynamically updating Databus Collection. Our automated extraction robot “MARVIN” publishes monthly dev versions of the full extraction, which are then refined and enriched to become Snapshot.      
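
To make the lag described above concrete, here is a minimal sketch that estimates how old the Wikipedia data behind a snapshot is on a given day. It assumes that a snapshot identifier such as 2022-12 names the year and month of the underlying Wikipedia dump and that the dump is dated the first day of that month; the function names are hypothetical and not part of any DBpedia tooling.

```python
from datetime import date


def dump_date(snapshot_id: str) -> date:
    """Return the assumed Wikipedia dump date for an identifier like '2022-12'."""
    year, month = snapshot_id.split("-")
    # Assumption: the dump is dated the first day of the named month.
    return date(int(year), int(month), 1)


def lag_in_days(snapshot_id: str, on: date) -> int:
    """Number of days between the assumed dump date and the day `on`."""
    return (on - dump_date(snapshot_id)).days


def lag_in_months(snapshot_id: str, on: date) -> float:
    """Approximate lag in months, using an average month length of 30.44 days."""
    return lag_in_days(snapshot_id, on) / 30.44


if __name__ == "__main__":
    announcement_day = date(2023, 3, 27)  # date of the 2022-12 release post
    print(lag_in_days("2022-12", announcement_day), "days")
    print(round(lag_in_months("2022-12", announcement_day), 1), "months")
```

Under these assumptions, the 2022-12 data was a little under four months old on the day of the announcement, which is at the upper end of the 1-4 month range.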

Data Quality & Richness

We would like to acknowledge the excellent work of Wikipedia editors (~46k active editors for EN Wikipedia), who are ultimately responsible for collecting information in Wikipedia’s infoboxes, which are refined by DBpedia’s extraction into our knowledge graphs. Wikipedia’s infoboxes are steadily growing each month and, according to our measurements, grow by 150% every three years. EN Wikipedia’s infoboxes even doubled in this timeframe. This richness of knowledge drives the DBpedia Snapshot knowledge graph and is further potentiated by synergies with linked data cross-references. Statistics are given below.
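
The three-year growth figures can be translated into an average monthly rate with a small calculation. The sketch below assumes steady compound growth, which the text does not claim, and it leaves the reading of "150%" open: it could mean growth to 1.5 times the original size or growth by an additional 150% (2.5 times). The function name is hypothetical.

```python
def monthly_rate(growth_factor: float, months: int) -> float:
    """Average compound monthly growth rate that yields `growth_factor` after `months`."""
    return growth_factor ** (1.0 / months) - 1.0


if __name__ == "__main__":
    three_years = 36
    # Assumption: each factor is one possible reading of the figures in the text.
    for label, factor in [
        ("to 150% of the original size", 1.5),
        ("doubled", 2.0),
        ("by an additional 150%", 2.5),
    ]:
        rate = monthly_rate(factor, three_years)
        print(f"{label}: {rate * 100:.2f}% per month")
```

Whichever reading is intended, the resulting monthly rate stays in the low single-digit percent range.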

Data Access & Interaction Options

Linked Data

Linked Data is a principled approach to publishing RDF data on the Web that enables interlinking data between different data sources, courtesy of the built-in power of Hyperlinks as unique Entity Identifiers.


HTML pages comprising Hyperlinks that conform to Linked Data Principles are one of the methods of interacting with data provided by the DBpedia Snapshot, be it manually via the web browser or programmatically using REST interaction patterns via the https://dbpedia.org/resource/{entity-label} pattern. Naturally, we encourage Linked Data interactions, while also expecting user-agents to honor the cache-control HTTP response header for massive crawl operations. See the instructions for accessing Linked Data, available in 10 formats.
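
As an illustration of the programmatic route, the following minimal sketch builds a resource URL from an entity label, requests it with an Accept header, and reads the max-age value from the Cache-Control response header so that a crawler can decide how long to reuse its copy. The helper names are hypothetical, and two details are assumptions rather than statements from the text: that labels follow the Wikipedia convention of replacing spaces with underscores, and that Turtle (text/turtle) is among the formats served.

```python
import re
from typing import Optional, Tuple
from urllib.parse import quote
from urllib.request import Request, urlopen

RESOURCE_BASE = "https://dbpedia.org/resource/"


def resource_url(label: str) -> str:
    """Build a resource URL from an entity label such as 'Leipzig University'."""
    # Assumption: spaces in the label become underscores, as in Wikipedia titles.
    return RESOURCE_BASE + quote(label.strip().replace(" ", "_"), safe="(),")


def max_age_seconds(cache_control: Optional[str]) -> Optional[int]:
    """Return the max-age value of a Cache-Control header, or None if absent."""
    if not cache_control:
        return None
    match = re.search(r"max-age\s*=\s*(\d+)", cache_control, flags=re.IGNORECASE)
    return int(match.group(1)) if match else None


def fetch_description(label: str, timeout: float = 30.0) -> Tuple[str, Optional[int]]:
    """Fetch an entity description and the number of seconds it may be cached."""
    # Assumption: the server honors content negotiation for text/turtle.
    request = Request(resource_url(label), headers={"Accept": "text/turtle"})
    with urlopen(request, timeout=timeout) as response:
        body = response.read().decode("utf-8")
        return body, max_age_seconds(response.headers.get("Cache-Control"))


if __name__ == "__main__":
    print(resource_url("Leipzig University"))
    print(max_age_seconds("public, max-age=3600"))
```

A crawler built on this sketch would store each response together with its max-age and skip re-fetching until that time has elapsed.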

SPARQL Endpoint

This service enables some astonishing queries against Knowledge Graphs derived from Wikipedia content. The Query Services Endpoint that makes this possible is identified by http://dbpedia.org/sparql, and it currently handles 7.2 million queries daily on average. See powerful queries and instructions (incl. rates and limitations).

An effective Usage Pattern is to filter a relevant subset of entity descriptions for your use case via SPARQL and then combine this with the power of Linked Data by looking up (or de-referencing) data via owl:sameAs property links, en route to retrieving specific and recent data from other Knowledge Graphs across the massive Linked Open Data Cloud.
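As a rough sketch of this usage pattern, the Python snippet below builds a SPARQL query that lists the owl:sameAs links of one entity and encodes it as a GET URL for the endpoint named above. The helper names `same_as_query` and `sparql_get_url` are hypothetical. The `query` parameter follows the SPARQL 1.1 Protocol; the `format` parameter is an assumption about the endpoint software and may need adjusting. Nothing is sent over the network here.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

SPARQL_ENDPOINT = "http://dbpedia.org/sparql"  # endpoint named in the text


def same_as_query(resource_uri: str, limit: int = 10) -> str:
    """Return a SPARQL query listing the owl:sameAs targets of one entity."""
    return (
        "PREFIX owl: <http://www.w3.org/2002/07/owl#>\n"
        f"SELECT ?other WHERE {{ <{resource_uri}> owl:sameAs ?other }} LIMIT {limit}"
    )


def sparql_get_url(query: str, result_format: str = "application/sparql-results+json") -> str:
    """Encode a query as a GET URL for the endpoint.

    'query' is the standard SPARQL Protocol parameter; 'format' is an
    assumed, endpoint-specific way to request JSON results.
    """
    return SPARQL_ENDPOINT + "?" + urlencode({"query": query, "format": result_format})


if __name__ == "__main__":
    url = sparql_get_url(same_as_query("http://dbpedia.org/resource/Leipzig"))
    print(url)
    # Round-trip check: the encoded URL decodes back to the original query text.
    print(parse_qs(urlsplit(url).query)["query"][0])
```

Each `?other` value returned by such a query is itself a Linked Data identifier, so it can be de-referenced in turn to pull the corresponding description from the external Knowledge Graph.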

Additionally, DBpedia Snapshot dumps and additional data from the complete collection of datasets derived from Wikipedia are provided by the DBpedia Databus for use in your own SPARQL-accessible Knowledge Graphs.

DBpedia Ontology

This Snapshot Release was built with DBpedia Ontology (DBO) version: https://databus.dbpedia.org/ontologies/dbpedia.org/ontology--DEV/2021.11.08-124002. We thank all DBpedians for their contributions to the ontology and the mappings. See documentation and visualizations, class tree and properties, wiki.

DBpedia Snapshot Statistics

Overview. Overall the current Snapshot Release contains more than 850 million facts (triples).

The DBpedia ontology is the heart of DBpedia. Our community continuously contributes to the DBpedia ontology schema and the DBpedia infobox-to-ontology mappings by actively using the DBpedia Mappings Wiki.

The current Snapshot Release utilizes a total of 55 thousand properties, of which 1377 are defined by the DBpedia ontology.

Classes. Knowledge in Wikipedia is constantly growing at a rapid pace. We use the DBpedia Ontology Classes to measure the growth. The list gives the total number in this release; in brackets we give a) the growth relative to the previous release, which can temporarily be negative, and b) the growth compared to Snapshot 2016-10:

  • Persons: 1792308 (1.01%, 1.13%)
  • Places: 748372 (1.00%, 1820.86%), including but not limited to 590481 (1.00%, 5518.51%) populated places
  • Works: 610589 (1.00%, 619.89%), including but not limited to
    • 157566 (1.00%, 1.38%) music albums
    • 144415 (1.01%, 15.94%) films
    • 24829 (1.01%, 12.53%) video games
  • Organizations: 345523 (1.01%, 109.31%), including but not limited to
    • 87621 (1.01%, 2.25%) companies
    • 64507 (1.00%, 64507.00%) educational institutions
  • Species: 1933436 (1.01%, 322239.33%)
  • Plants: 7718 (0.82%, 1.71%)
  • Diseases: 10591 (1.00%, 8.54%)

Detailed Growth of Classes: The image below shows the detailed growth for one class. Click on the links for other classes: Place, PopulatedPlace, Work, Album, Film, VideoGame, Organisation, Company, EducationalInstitution, Species, Plant, Disease. For further classes, adapt the query by replacing the <http://dbpedia.org/ontology/CLASS> URI. Note that 2018 was a development phase with some failed extractions. The stats were generated with the Databus VOID Mod.

Links. Linked Data cross-references between decentralized datasets are the foundation and access point to the Linked Data Web. The latest Snapshot Release provides over 130.6 million links from 7.62 million entities to 179 external sources.

Top 11


33,975,305 http://www.wikidata.org 

  7,206,254 https://global.dbpedia.org 

  4,308,772 http://yago-knowledge.org 

  3,855,108 http://de.dbpedia.org 

  3,731,002 http://fr.dbpedia.org 

  2,991,921 http://viaf.org 

  2,929,808 http://it.dbpedia.org 

  2,925,530 http://es.dbpedia.org 

  2,788,703 http://fa.dbpedia.org 

  2,587,004 http://ru.dbpedia.org 

  2,580,398 http://sr.dbpedia.org 

Top 10 without DBpedia namespaces


33,975,305 http://www.wikidata.org 

  4,308,772 http://yago-knowledge.org 

  2,991,921 http://viaf.org

  1,708,533 http://d-nb.info 

     612,227 http://sws.geonames.org 

     596,134 http://umbel.org 

     537,602 http://data.bibliotheken.nl 

     430,839 http://www.w3.org 

     422,989 http://musicbrainz.org 

     104,433 http://linkedgeodata.org 

DBpedia Extraction Dumps on the Databus

All extracted files are reachable via the DBpedia account on the Databus. The Databus has two main structures:

Snapshot Download. For downloading DBpedia Snapshot, we prepared this collection, which also includes detailed release notes: 

https://databus.dbpedia.org/dbpedia/collections/dbpedia-snapshot-2022-03

The collection is roughly equivalent to http://downloads.dbpedia.org/2016-10/core/

Collections can be downloaded in many different ways. Some download modalities, such as bash script, SPARQL, and plain URL list, are found in the tabs at the collection. Files are provided as bzip2-compressed N-Triples files. In case you need a different format or compression, you can also use the “Download-As” function of the Databus Client (GitHub), e.g. -s $collection -c gzip would download the collection and convert it to GZIP during download. 
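If the files have already been downloaded as bzip2, the same kind of compression change can also be done locally. The following Python sketch uses only the standard library and is not how the Databus Client works internally; the function name `recompress_bz2_to_gzip` and the file names in the usage comment are hypothetical.

```python
import bz2
import gzip
import shutil


def recompress_bz2_to_gzip(src_path: str, dst_path: str) -> None:
    """Stream a bzip2-compressed file into a gzip-compressed file.

    The data is copied in chunks, so large dump files never have to be
    held in memory as a whole.
    """
    with bz2.open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


# Example call with made-up file names:
# recompress_bz2_to_gzip("labels_lang=en.ttl.bz2", "labels_lang=en.ttl.gz")
```

Streaming keeps memory use flat regardless of file size, which matters for the larger dump files in a collection.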

Replicating DBpedia Snapshot on your server can be done via Docker, see https://hub.docker.com/r/dbpedia/virtuoso-sparql-endpoint-quickstart 

git clone https://github.com/dbpedia/virtuoso-sparql-endpoint-quickstart.git

cd virtuoso-sparql-endpoint-quickstart

COLLECTION_URI=https://databus.dbpedia.org/dbpedia/collections/dbpedia-snapshot-2022-09 VIRTUOSO_ADMIN_PASSWD=password docker-compose up

Download files from the whole DBpedia extraction. The whole extraction consists of approx. 20 billion triples and 5000 files created from 140 languages of Wikipedia, Commons and Wikidata. They can be found in https://databus.dbpedia.org/dbpedia/(generic|mappings|text|wikidata)

You can copy-edit a collection and create your own customized collections via “Actions” -> “Copy Edit”, e.g. you can Copy Edit the snapshot collection above, remove some files that you do not need and add files from other languages. Please see the Rhizomer use case: Best way to download specific parts of DBpedia. Of course, this only refers to the archived dumps on the Databus for users who want to bulk download and deploy into their own infrastructure. Linked Data and SPARQL allow for filtering the content using a small data pattern.  

Acknowledgments

First and foremost, we would like to thank our open community of knowledge engineers for finding & fixing bugs and for supporting us by writing data tests. We would also like to acknowledge the DBpedia Association members for constantly innovating the areas of knowledge graphs and linked data and pushing the DBpedia initiative with their know-how and advice. OpenLink Software supports DBpedia by hosting SPARQL and Linked Data; University Mannheim, the German National Library of Science and Technology (TIB) and the Computer Center of University Leipzig provide persistent backups and servers for extracting data. We thank Marvin Hofer and Mykola Medynskyi for technical preparation. This work was partially supported by grants from the Federal Ministry for Economics and Climate Action (BMWK) for the LOD-GEOSS Project (03EI1005E), PenFLaaS (100594042) as well as for the PLASS Project (01MD19003D).

The post DBpedia Snapshot 2022-12 Release appeared first on DBpedia Association.

]]>
GSoC2023 – Call for Contributors https://www.dbpedia.org/blog/gsoc2023-call-for-contributors/ Thu, 02 Mar 2023 10:02:40 +0000 https://www.dbpedia.org/?p=5577

The post GSoC2023 – Call for Contributors appeared first on DBpedia Association.

]]>
Are you a student looking for a summer experience that combines coding skills with open source development? Then look no further than the Google Summer of Code program 2023, where you can join forces with DBpedia to help advance the state of the art in semantic web technologies. Build your skills and gain valuable experience while making a real impact on the tech community!

For the 12th year in a row, we have been accepted to be part of this incredible program to support young ambitious developers who want to work with open-source organizations like DBpedia. So far, each year has brought us new project ideas, many amazing students and great project results that shaped the future of DBpedia. Even though Covid-19 changed a lot in the world, it couldn’t shake Google Summer of Code (GSoC) much. The program, designed to mentor youngsters from afar, is almost too perfect for us. One of the advantages of GSoC, especially in times like these, is the chance to work on projects remotely while still getting a first deep dive into Open Source projects like ours.

DBpedia is now looking for contributors who want to work with us during the upcoming summer months.  

What is Google Summer of Code?

Google Summer of Code is a global program focused on bringing developers into open source software development. Funds will be given to all new contributors to open source who are over 18 years old to work for two and a half months (or longer) on a specific task. For GSoC newcomers, this short video and the information provided on their website will explain all there is to know about GSoC2023.

And this is how it works …

Step 1: Check out one of our projects here or draft your own. 
Step 2: Get in touch with our mentors as soon as possible and write up a project proposal of at least 8 pages. Information about our proposal structure and a template are available here.  
Step 3: After a selection phase, contributors are matched with a specific project and mentor(s) and start working on the project. 

Application Procedure GSoC2023

Further information on the application procedure is available in our DBpedia Guidelines. There you will find information on how to contact us and how to appropriately apply for GSoC2023. Please also note the official GSoC 2023 timeline for your proposal submission and make sure to submit on time. Unfortunately, extensions cannot be granted. The final submission deadline is April 4, 2023 at 18:00 UTC.

Contact

Detailed information on how to apply is available on the DBpedia website. We’ve prepared an information kit for you. Please find all necessary information regarding the student application procedure here.

And in case you still have questions, please do not hesitate to contact us via dbpedia@infai.org.

Stay safe and check Twitter or LinkedIn. Furthermore, you can subscribe to our Newsletter for the latest news and information around DBpedia.

Finally, we are looking forward to your contribution!

Yours DBpedia Association


]]>
LOD activities at the National Archives of the Netherlands https://www.dbpedia.org/blog/lod-activities-at-the-national-archives-of-the-netherlands/ Tue, 14 Feb 2023 10:56:31 +0000 https://www.dbpedia.org/?p=5569

The post LOD activities at the National Archives of the Netherlands appeared first on DBpedia Association.

]]>
By Ed de Heer

About the National Archives

This article describes the Linked Open Data (LOD) activities of the National Archives of the Netherlands and is based on my presentation at Semantics 2022 in Vienna.

At the National Archives people find information about their lives, Dutch (political/administrative) history and society. Our mission is: “we serve every person’s right to information, and we offer insight into the history of our country.” The National Archives believes in the power of open data. We want to offer as much open data as possible, not only to the government and historians but also to third parties that develop new applications and websites. In this way the general public can participate, and new ways of disclosing heritage information can arise. We publish our data (archives, indexes, and photographs) under a CC0 license as CSV and XML and through APIs. Below is an overview of our overall collection and services.

Linked open data

We have been working on the development of Linked Open Data since 2018, when we started our first LOD experiments and bought an ETL tool to transform our data to RDF. In 2019 we developed a URI strategy and started to model the indexes. We have indexes about enslaved people and slavery, fish rights, emigrants, finance, etc., so we had to develop all kinds of LOD models and use different ontologies. We have just finished publishing our 400,000 digitized pictures under a CC0 license as RDF through our SPARQL endpoint (Beta): https://www.nationaalarchief.nl/onderzoeken/linked-open-data/sparql-interface

Challenges of linked open data

When transforming to RDF we faced some challenges, for instance the challenge of data quality. We don’t improve the quality of our data. If we wanted to curate our data, we would have to check the original archives, which would take a lot of effort. And what is right or wrong? When a particular archive speaks of “Amsteredam” instead of “Amsterdam”, the record states “Amsteredam” and not “Amsterdam”, because that is the original spelling in the archive. Also, within an organization such as the National Archives, a lot of stakeholders are involved: IT, Collection, Services, and management. It takes a lot of time and effort to get all the priorities straight.

The Verkaufsbücher

One of our most successful LOD projects is the Verkaufsbücher. This is an administration kept by the Nazis during World War II in which they recorded the expropriation of Jewish properties in the Netherlands. These houses were “bought” from their Jewish owners far below the real price, and the owners were often deported shortly afterwards. The National Archives wanted to visualize this story and this data. We worked with the land registry of the Netherlands (Kadaster) and developed a data story (https://labs.kadaster.nl/stories/verkaufsbucher/index.html). This data story was noticed by a Dutch broadcasting company, which aired an item about it on national television. The broadcast generated a lot of exposure and drew the attention of Dutch government agencies and municipalities. As a result, local governments have started to investigate what happened to these properties during and directly after the war, and some municipalities are going to compensate the victims or their next of kin.

Digital Heritage Network and the Dataset Register

All these LOD developments don’t thrive on their own. Working together with other institutions and professionals is vital. The Dutch Digital Heritage Network is a partnership of cultural heritage agencies in the Netherlands. It focuses on developing a system of national facilities and services for improving the visibility, usability, and sustainability of digital heritage based on linked open data. The network is open to all Dutch institutions and organizations in the digital heritage field.

The Dataset Register is an initiative of the Digital Heritage Network. The National Archives hosts the Dataset Register. This register provides insight into the availability of data sets in the heritage field and thus stimulates the use of these datasets. The Dataset Register makes it easier to publish information about heritage datasets. By analyzing the datasets we can build a knowledge graph on heritage data for better use and the Dataset Register can help software (Google) to find collections.

Heritage institutions are encouraged to make their datasets available, to describe these datasets, to publish them online, and to submit the URLs of the dataset descriptions to the Dataset Register. The Dataset Register retrieves the dataset descriptions, creating an overall picture of what is available. See also https://datasetregister.netwerkdigitaalerfgoed.nl/?lang=en

Drs. Ed de Heer MIM is an advisor and project manager for Linked and Open Data at the National Archives and administrator of the Dataset Register. 


]]>