Vincent Oria


Contact Info






GITC 4317D




Computer Science

Web page 


Academic Interests: Multimedia Databases, Spatio-Temporal Databases and Recommender Systems.






Current Research Projects

·         Multi-instrument database of solar flares

(Research conducted in collaboration with Alexander Kosovichev (NJIT Physics department) and Gelu Nita (NJIT Physics department, PI))

In this research we are interested in active caching mechanisms and multi-attribute recommendations:

Solar flares represent the most prominent manifestation of the Sun's magnetic activity. The flare emissions cover the whole range of the electromagnetic spectrum, from radio waves to gamma rays. Solar flares are often accompanied by high-speed coronal mass ejections and high-energy particles, which play a critical role in the Earth's space environment and space weather. Solar flares are undoubtedly an important and enigmatic phenomenon on the Sun, affecting the whole heliosphere and space weather, and are therefore a focus of research in the heliophysics community. Investigating the physics underlying solar flare phenomena is a high-priority science objective addressing the goals of NASA's Heliophysics Research Program, specifically NASA’s Strategic Goal 2.2, “to study the Sun and its interactions with the Earth and the Solar System.” Currently, solar data are collected in databases of individual instrument teams, and several web- or software-based data portals provide access to such data, either directly from instrument websites or through central archives such as the Solar Data Analysis Center (SDAC), the Space Physics Data Facility (SPDF), or the National Space Science Data Center (NSSDC). Several virtual observatories (VxOs) also provide access to solar data from many space- or ground-based observatories, as well as the necessary tools for cross-mission analysis and visualization. While the interactive access through web forms or stand-alone applications offered by most of these data portals may satisfactorily serve the immediate purpose of finding and retrieving data associated with a particular solar event, it is extremely difficult to automatically search for solar flares according to a user-defined set of common physical properties, and to collect these data and prepare them for multi-instrument/multi-wavelength analysis.
Because of these difficulties, most current investigations are limited to the analysis of individual events and include only a few specific instruments, which may not provide the full picture needed to clearly assign such events to one class or another. Therefore, to enhance the scientific return from the entire fleet of existing solar observing instruments, it is essential to develop a dedicated database that efficiently provides researchers with observational data, produced by various missions and instruments, for specific classes of flares.

·         Dimensionality and Scalability Issues in High Dimensional Spaces

(Research conducted in collaboration with Michael Houle, University of Melbourne, Australia)

For many fundamental operations in the areas of search and retrieval, data mining, machine learning, multimedia, recommendation systems, and bioinformatics, the efficiency and effectiveness of implementations depend crucially on the interplay between measures of data similarity and the features by which data objects are represented. When the number of features (the data dimensionality) is high, similarity values tend to concentrate strongly about their means, a phenomenon commonly referred to as the curse of dimensionality. As the dimensionality increases, the discriminative ability of similarity measures diminishes to the point where methods that depend on them lose their effectiveness. One fundamental task, arising in applications of multimedia, data mining, machine learning, and other disciplines, is that of content-based similarity search. For such applications, features are often sought so as to provide the best possible coverage across a range of anticipated queries. However, for any given query, only a relatively small number of features may turn out to be relevant. When the dimensionality is high, the errors introduced into similarity measurements by the many irrelevant feature attributes can completely overwhelm the contributions of the relevant features. We have proposed successful techniques for local feature selection with application to search and clustering. Given a dataset in a high-dimensional space, we are currently looking for more discriminative lower-dimensional representations using machine learning techniques.
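The concentration effect described above can be observed with a small experiment. The following is only an illustrative sketch, not code from this project: it assumes points drawn uniformly from the unit hypercube and Euclidean distance, and the function name `relative_contrast` and its parameters are hypothetical. It measures how much farther the farthest point is from a query than the nearest one, relative to the nearest distance.

```python
import math
import random


def relative_contrast(dim, n_points=500, seed=0):
    """Return (farthest - nearest) / nearest Euclidean distance from a
    random query to n_points uniform random points in [0, 1]^dim.

    Assumption for illustration: uniform data and Euclidean distance.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    nearest, farthest = min(dists), max(dists)
    return (farthest - nearest) / nearest


if __name__ == "__main__":
    # The contrast shrinks as the dimensionality grows, so the nearest
    # and farthest neighbors become harder to tell apart.
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(dim), 3))
```

Under these assumptions the printed contrast should drop steadily as the dimension increases, which is the loss of discriminative ability that motivates feature selection and lower-dimensional representations.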

Research Funding

  • 2021-2026 NJIT Secure Computing Initiative (Renewal), NSF, $4,500,000, Principal Investigator
  • 2020-2023 Machine Learning Tools for Predicting Solar Energetic Particle Hazards, NASA, $549,998, Co-Principal Investigator
  • 2019-2022 EarthCube Data Capabilities: Machine Learning Enhanced Cyberinfrastructure for Understanding and Predicting the Onset of Solar Eruptions, NSF, $800,000, Co-Principal Investigator
  • 2017-2020 EarthCube Data Infrastructure: Intelligent Databases and Analysis Tools for Geospace Data, NSF, $500,000, Co-Principal Investigator
  • 2016-2021 NJIT Secure Computing Initiative, NSF, $4,000,000, Principal Investigator
  • 2015-2017 Multi-instrument database of solar flares, NASA, $200,000, Co-Principal Investigator
  • 2012-2016 Collaborative Project: Integrating Learning Resources for Information Security Research and Education (iSECURE), National Science Foundation (NSF), $418,154, Principal Investigator
  • 2006-2009 Event-Based Fusion of Distributed Multimedia Data Sources, funded by DoD-Army Research Laboratories (ARL) as part of the KIMCOE center of excellence, Morgan State University
  • 2004-2008 General Recommendation Engine, funded by NSF