We're a student-faculty group that aims to increase transparency and accelerate collaboration using publicly available UM6P datasets.

Explore our Datasets

Discover a wealth of information in our comprehensive data catalog. Browse datasets across various disciplines, including Science & Technology, Humanities, and Business. Our data catalog is designed to support research, learning, and innovation.


Articles Featuring our Datasets

An open access NLP dataset for Arabic dialects

Natural Language Processing (NLP) is today a very active field of research and innovation. Many applications, however, need large data sets for supervised learning, suitably labeled for training. This includes applications for the Arabic language and its national dialects. However, open access labeled data sets in Arabic and its dialects are lacking in the Data Science ecosystem, and this lack can hinder innovation and research in the field. In this work, we present an open data set of social media content in several Arabic dialects. The data was collected from the Twitter social network and consists of more than 50K tweets in five (5) national dialects. Furthermore, the data was labeled for several applications, namely dialect detection, topic detection, and sentiment analysis. We publish this data as open access to encourage innovation and further work in the field of NLP for Arabic dialects and social media. A selection of models was built using this data set and is presented in this paper along with their performance.

Open data for Moroccan license plates for OCR applications

A significant amount of research has been carried out recently on intelligent systems for traffic management, especially OCR-based license plate recognition, which is considered a main step in any automatic traffic management system. Good-quality data sets are increasingly needed and produced by the research community to improve the performance of those algorithms. Furthermore, there is a particular need for data from countries with special characters on their license plates, like Morocco, where the Arabic alphabet is used. In this work, we present a labeled open data set of license plates photographed in Morocco, for different types of vehicles, namely cars, trucks, and motorcycles. The data was collected manually and consists of 705 unique images. Furthermore, the data was labeled for plate segmentation and for registration number OCR. As we show in this paper, the data can be enriched using data augmentation techniques to create training sets of a few thousand images for different machine learning and AI applications. We present and compare a set of models built on this data. We also publish this data as open access to encourage innovation and applications in the field of OCR and image processing for traffic control and other applications in transportation and heterogeneous vehicle management.


Featured Datasets

Social Media Posts in Arabic Dialect

This dataset contains a labeled collection of approximately 50,000 social media posts in various Arabic dialects. Each post has been manually annotated with sentiment labels, providing a rich resource for natural language processing and sentiment analysis research.

Moroccan Vehicle Plates OCR

This dataset contains a labeled collection of approximately 800 unique pictures of Moroccan vehicle plates. The dataset is designed to facilitate research and development in Optical Character Recognition (OCR) applications specific to Moroccan license plates.


Join the Movement TODAY!

Our datasets are collected through crowdsourcing, relying on the generous contributions of volunteers to share, collect, manage, and publish this valuable data. We are constantly seeking enthusiastic volunteers to join our efforts and help us expand our collection of datasets. If you are interested in contributing, please contact us via these forms.


Interested in Joining the Movement? Check out the Dataset Contribution Form or our Volunteer Application Form.
© UM6P Vision 2030.

About UM6P

At University Mohammed VI Polytechnic (UM6P), we dedicate our efforts to becoming a leading institution of higher education, research, and innovation nationally and transnationally. As an institution engaged in the sustainable development of the continent, UM6P combines theoretical knowledge with practical application to prepare apprentices to address critical challenges. We cover a wide range of disciplines, from Science & Technology to Health, Humanities, and Business & Management, to equip our community with the perfect tools to contribute positively to the development of our society. UM6P’s approach is both Moroccan and African, designed to make our continent great again by taking advantage of opportunities to achieve socioeconomic growth.


Story Behind Open Data Portal

Story of UM6P Open Data Portal

The UM6P Open Data Portal was founded by the Vision 2030 team to harness the power of open data for the betterment of our university and society. We launched the UM6P Open Data Portal to provide accessible and comprehensive data resources, fostering a culture of openness and collaboration. Our platform aligns with UM6P’s Vision 2030, which seeks to foster excellence, innovation, and collective intelligence within our community.

Our Core Values

  • Transparency: Ensuring openness and accountability in all our actions.

  • Innovation: Encouraging creative solutions and forward-thinking approaches.

  • Community: Building a diverse and inclusive environment where every voice matters.

  • Guiding principles: Shared, governed, and applied.


Our Work

Data Catalog

A comprehensive repository of publicly accessible UM6P datasets, enabling students, faculty, and researchers to explore and utilize data for various academic and research purposes.

Documentation

A collection of guides and tutorials on data science, web development, and other relevant skills, created by our community to support learning and collaboration.

Data Projects

In-depth analyses and interactive projects that showcase the application of data science and data visualization techniques on diverse topics related to university life and societal challenges.

Datathon

A competitive event that will be held each semester, where participants can collaborate on data-driven projects, attend workshops, and enhance their skills while enjoying a vibrant community atmosphere.


Join us Today to Accelerate the UM6P Open Data Initiative

