Bulletin of the Astronomical Society of India
P. Škoda* and
Mihir Arjunwadkar2
Astronomical Institute of the Czech Academy of Sciences, Fricova 298, Ondrejov
The archives of multi-object spectral surveys such as SDSS
or LAMOST currently contain millions of pipeline-reduced spectra of celestial
objects. Most can be identified as stars of recognised spectral types
through quick comparison with extensive lists of template spectra.
To date, the dominant application of spectral libraries has been the statistical
estimation of similarity, computed sequentially or in a simply parallel manner, by
comparing all the survey spectra and their PCA components with a grid of
templates.
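As an illustration of the template comparison described above, the following minimal sketch picks the best-matching template by least-squares residual. It assumes all spectra have already been resampled onto a common wavelength grid; the function name `best_template` and the plain (unweighted) chi-square are illustrative assumptions, not the procedure of any particular survey pipeline.

```python
import numpy as np


def best_template(flux, templates):
    """Return (index, chi2) of the template closest to `flux`.

    flux      : 1-D array, observed spectrum on the common wavelength grid.
    templates : 2-D array, one template per row, on the same grid.
    Hypothetical helper for illustration only.
    """
    # Least-squares scale factor for each template: a = <t, f> / <t, t>
    scale = templates @ flux / np.einsum("ij,ij->i", templates, templates)
    # Residual of the observed spectrum against every scaled template
    residuals = flux - scale[:, None] * templates
    chi2 = np.sum(residuals**2, axis=1)
    idx = int(np.argmin(chi2))
    return idx, float(chi2[idx])
```

Because each template is compared independently, the loop over survey spectra is trivially parallel, which is the "simply parallel" regime mentioned above.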
In this paper we propose a new approach that uses modern machine-learning
techniques such as semi-supervised training, deep learning, or outlier
detection to help identify specific rare cases of unusual objects, such as
stars with strong emission lines or P-Cyg profiles, or blazars, as well as to
eliminate the instrumental and processing artefacts which cannot be handled
correctly by a normal streaming pipeline. The volume of data and the
time-consuming algorithms require a ‘Big Data’ approach, using massively
parallel processing in the cloud with modern technologies such as
GPGPUs, Hadoop and Spark.
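One simple form of outlier detection can be sketched as follows. This is not the method of the paper, only a minimal example under stated assumptions: spectra share a common wavelength grid, and a spectrum is scored by how poorly a few principal components reconstruct it. The function name `pca_outlier_scores` and the default number of components are hypothetical choices.

```python
import numpy as np


def pca_outlier_scores(spectra, n_components=5):
    """Score each spectrum by its PCA reconstruction error.

    spectra : 2-D array, one spectrum per row, on a common wavelength grid.
    A high score means the leading components describe the spectrum poorly,
    as one might expect for a strong emission line or an artefact.
    Hypothetical helper for illustration only.
    """
    centred = spectra - spectra.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    basis = vt[:n_components]
    # Project onto the leading components and map back to flux space
    reconstructed = centred @ basis.T @ basis
    return np.sum((centred - reconstructed) ** 2, axis=1)
```

Spectra with the highest scores would then be candidates for visual inspection rather than confirmed detections.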
An important step in verifying the results is interactive
visualisation and cross-matching with other data such as photometric surveys,
spectra acquired by other surveys, space missions and multi-wavelength
data of similar coverage, as well as comparisons with alternative models.
All this can be easily achieved through correct exploitation of Virtual Observatory
standards.
Keywords: stars: emission-line, Be; methods: data analysis; techniques: spectroscopic; virtual observatory; machine learning