Download this as a Jupyter notebook!
Marvin Queries¶
This tutorial goes through a few basics of how to perform queries on the MaNGA dataset using the Marvin Query tool. Please see the Marvin Query page for more details on how to use Queries. This tutorial covers the basics of:
- querying on metadata information from the NSA catalog
- how to combine multiple filter and return additional parameters
- how to perform radial cone searches with Marvin
- querying on information from the MaNGA DAPall summary file
- querying using quality and target flags
First let’s import some basics
In [18]:
# we should be using DR15 MaNGA data
from marvin import config
config.release
# import the Query tool
from marvin.tools.query import Query
Query Basics¶
Querying on Metadata¶
Let’s go through some Query basics of how to do a query on metadata. The
two main keyword arguments to Query are search_filter and
return_params. search_filter is a string representing the SQL
where
condition you’d like to filter on. This tutorial assumes a
basic familiarity with the SQL boolean syntax needed to construct Marvin
Queries. Please see the tutorial on SQL Boolean
syntax to learn more.
return_params is a list of parameters you want to return in the
query in addition to those used in the SQL filter condition.
Let’s search for all galaxies with a redshift less than 0.1. To specify
our search parameter, redshift, we must know the database table and name
of the parameter. In this case, MaNGA uses the NASA-Sloan Atlas (NSA)
for redshift information. In the NSA catalog, the redshift is the z
parameter of the nsa table, so our search parameter will be
nsa.z
. Generically, all search parameters will take the form
table.parameter
.
In [19]:
# filter for galaxies with a redshift < 0.1
my_filter = 'nsa.z < 0.1'
In [20]:
# construct the query
q = Query(search_filter=my_filter)
# run the query
r = q.run()
# print some stuff
print(r)
print('number of results:', r.totalcount)
Marvin Results(query=nsa.z < 0.1, totalcount=4275, count=100, mode=remote)
number of results: 4275
After constructing queries, we can run them with q.run(). This returns a Marvin Results object. Let’s take a look. This query returned 4275 objects. For queries with large results, the results are automatically paginated in sets of 100 objects. Default parameters returned in queries always include the mangaid and plateifu. Marvin Queries will also return any parameters used in the definition of your filter condition. Since we filtered on redshift, the redshift is automatically included.
In [21]:
# look at the current page of results (subset of 10)
print('number in current set:', len(r.results))
print(r.results[0:10])
number in current set: 100
<ResultSet(set=1.0/428, index=0:10, count_in_set=10, total=4275)>
[ResultRow(mangaid='1-109056', plateifu='8077-6103', z=0.0473019),
ResultRow(mangaid='1-109073', plateifu='8078-12704', z=0.0435882),
ResultRow(mangaid='1-109081', plateifu='8077-12705', z=0.0438592),
ResultRow(mangaid='1-109112', plateifu='8078-1901', z=0.0249702),
ResultRow(mangaid='1-109152', plateifu='8154-12702', z=0.0281736),
ResultRow(mangaid='1-109167', plateifu='8078-1902', z=0.0246344),
ResultRow(mangaid='1-109234', plateifu='9193-1901', z=0.0409309),
ResultRow(mangaid='1-109244', plateifu='8080-12702', z=0.0267911),
ResultRow(mangaid='1-109250', plateifu='8080-1902', z=0.0383731),
ResultRow(mangaid='1-109270', plateifu='9193-12703', z=0.0252266)]
Multiple Search Criteria and Returning Additional Parameters¶
We can easily combine query filter conditions by constructing a boolean string using AND. Let’s search for galaxies with a redshift < 0.1 and log M\(_\star\) < 10. The NSA catalog contains the Sersic profile determination for stellar mass, which is the sersic_mass or sersic_logmass parameter of the ``nsa`` table, so our search parameter will be nsa.sersic_logmass.
Let’s also return the object RA and Dec as well using the
return_params keyword. This accepts a list of string parameters.
Object RA and Dec are included in the cube table so the parameter
names are cube.ra
and cube.dec
.
In [22]:
my_filter = 'nsa.z < 0.1 and nsa.sersic_logmass < 10'
q = Query(search_filter=my_filter, return_params=['cube.ra', 'cube.dec'])
r = q.run()
print(r)
print('Number of objects:', r.totalcount)
Marvin Results(query=nsa.z < 0.1 and nsa.sersic_logmass < 10, totalcount=1932, count=100, mode=remote)
Number of objects: 1932
This query return 1932 objects and now includes the RA, Dec, redshift and log Sersic stellar mass parameters.
In [23]:
# print the first 10 rows
r.results[0:10]
Out[23]:
<ResultSet(set=1.0/194, index=0:10, count_in_set=10, total=1932)>
[ResultRow(mangaid='1-109073', plateifu='8078-12704', ra=42.1567757009, dec=-0.554292603774, z=0.0435882, sersic_logmass=9.89949219613813),
ResultRow(mangaid='1-109234', plateifu='9193-1901', ra=45.9974263376, dec=0.426382098814, z=0.0409309, sersic_logmass=9.71193033196987),
ResultRow(mangaid='1-109250', plateifu='8080-1902', ra=47.7852584272, dec=-1.05434100442, z=0.0383731, sersic_logmass=8.86787095265526),
ResultRow(mangaid='1-109270', plateifu='9193-12703', ra=46.6858583566, dec=-0.40880014755, z=0.0252266, sersic_logmass=9.13073892096969),
ResultRow(mangaid='1-109356', plateifu='8081-3703', ra=50.0738407878, dec=0.523632230733, z=0.0241782, sersic_logmass=9.48148132628646),
ResultRow(mangaid='1-113219', plateifu='7815-9102', ra=317.374745914, dec=10.0519434342, z=0.0408897, sersic_logmass=9.37199275559893),
ResultRow(mangaid='1-113322', plateifu='7972-12701', ra=315.533514754, dec=11.0751131018, z=0.0431628, sersic_logmass=9.95351566466319),
ResultRow(mangaid='1-113346', plateifu='7972-12702', ra=315.85387762, dec=10.8770313048, z=0.0431389, sersic_logmass=9.94489259080932),
ResultRow(mangaid='1-113375', plateifu='7815-9101', ra=316.639658795, dec=10.7512221884, z=0.028215, sersic_logmass=9.82192731931789),
ResultRow(mangaid='1-113375', plateifu='7972-12704', ra=316.639658795, dec=10.7512221884, z=0.028215, sersic_logmass=9.82192731931789)]
Radial Queries in Marvin¶
Cone searches can be performed with Marvin Queries using a special
functional syntax in your SQL string. Cone searches can be performed
using the special radial
string function. The syntax for a cone
search query is radial(RA, Dec, radius). Let’s search for all
galaxies within 0.5 degrees centered on RA, Dec = 232.5447, 48.6902. The
RA and Dec must be in decimal degrees and the radius is in units of
degrees.
In [24]:
# build the radial filter condition
my_filter = 'radial(232.5447, 48.6902, 0.5)'
q = Query(search_filter=my_filter)
r = q.run()
print(r)
print(r.results)
Marvin Results(query=radial(232.5447, 48.6902, 0.5), totalcount=2, count=2, mode=remote)
<ResultSet(set=1.0/1, index=0:2, count_in_set=2, total=2)>
[ResultRow(mangaid='1-209232', plateifu='8485-1901', dec=48.6902009334, ra=232.544703894),
ResultRow(mangaid='1-209266', plateifu='8485-9101', dec=48.8332849239, ra=233.107502765)]
Queries using DAPall parameters.¶
MaNGA provides derived analysis properties in its dapall summary
file. Marvin allows for queries on any of the parameters in the file.
The table name for these parameters is dapall. Let’s find all
galaxies that have a total measure star-formation rate > 5
M\(_\odot\)/year. The total SFR parameter in the DAPall table is
sfr_tot
.
In [25]:
my_filter = 'dapall.sfr_tot > 5'
q = Query(search_filter=my_filter)
r = q.run()
print(r)
print(r.results)
Marvin Results(query=dapall.sfr_tot > 5, totalcount=6, count=6, mode=remote)
<ResultSet(set=1.0/1, index=0:6, count_in_set=6, total=6)>
[ResultRow(mangaid='1-24092', plateifu='7991-1901', sfr_tot=5.5633, bintype_name='VOR10', template_name='GAU-MILESHC'),
ResultRow(mangaid='1-24092', plateifu='7991-1901', sfr_tot=5.69988, bintype_name='HYB10', template_name='GAU-MILESHC'),
ResultRow(mangaid='1-37863', plateifu='9193-12704', sfr_tot=6.457, bintype_name='VOR10', template_name='GAU-MILESHC'),
ResultRow(mangaid='1-37863', plateifu='9193-12704', sfr_tot=6.82693, bintype_name='HYB10', template_name='GAU-MILESHC'),
ResultRow(mangaid='1-43214', plateifu='8135-1902', sfr_tot=26.4648, bintype_name='VOR10', template_name='GAU-MILESHC'),
ResultRow(mangaid='1-43214', plateifu='8135-1902', sfr_tot=27.4152, bintype_name='HYB10', template_name='GAU-MILESHC')]
The query returns 6 results, but looking at the plateifu, we see there
are only 3 unique targets. This is because the DAPall file provides
measurements for multiple bintypes and by default will return entries
for all bintypes. We can select those out using the bintype.name
parameter. Let’s filter on only the HYB10 bintype.
In [26]:
my_filter = 'dapall.sfr_tot > 5 and bintype.name==HYB10'
q = Query(search_filter=my_filter)
r = q.run()
print(r)
print(r.results)
Marvin Results(query=dapall.sfr_tot > 5 and bintype.name==HYB10, totalcount=3, count=3, mode=remote)
<ResultSet(set=1.0/1, index=0:3, count_in_set=3, total=3)>
[ResultRow(mangaid='1-24092', plateifu='7991-1901', bintype_name='HYB10', sfr_tot=5.69988, template_name='GAU-MILESHC'),
ResultRow(mangaid='1-37863', plateifu='9193-12704', bintype_name='HYB10', sfr_tot=6.82693, template_name='GAU-MILESHC'),
ResultRow(mangaid='1-43214', plateifu='8135-1902', bintype_name='HYB10', sfr_tot=27.4152, template_name='GAU-MILESHC')]
Query on Quality and Target Flags¶
Marvin includes the ability to perform queries using quality or target flag information. These work using the special quality and targets keyword arguments. These keywords accept a list of flag maskbit labels provided by the Maskbit Datamodel. These keywords are inclusive, meaning they will only filter on objects satisfying those labels.
Searching by Target Flags¶
Let’s find all galaxies that are in the MaNGA MAIN target selection sample. Targets in the MAIN sample are a part of the PRIMARY, SECONDARY and COLOR-ENHANCED samples. These are the primary, secondary, and color-enhanced flag labels. The targets keywords accepts all labels from the MANGA_TARGET1, MANGA_TARGET2, or MANGA_TARGET3 maskbit schema.
In [27]:
# create the targets list of labels
targets = ['primary', 'secondary', 'color-enhanced']
q = Query(targets=targets)
r = q.run()
print(r)
print('There are {0} galaxies in the main sample'.format(r.totalcount))
print(r.results[0:5])
Marvin Results(query=None, totalcount=4498, count=100, mode=remote)
There are 4498 galaxies in the main sample
<ResultSet(set=1.0/900, index=0:5, count_in_set=5, total=4498)>
[ResultRow(mangaid='1-109056', plateifu='8077-6103', manga_target1=2080),
ResultRow(mangaid='1-109073', plateifu='8078-12704', manga_target1=2336),
ResultRow(mangaid='1-109081', plateifu='8077-12705', manga_target1=4112),
ResultRow(mangaid='1-109112', plateifu='8078-1901', manga_target1=1040),
ResultRow(mangaid='1-109152', plateifu='8154-12702', manga_target1=4096)]
The targets keyword is equivalent to the cube.manga_targetX
search parameter, where X
is 1, 2, or 3. The bits for the primary,
secondary, and color-enhanced samples are 10, 11, and 12, respectively.
These combine into the value 7168. The above query is equivalent to the
filter condition cube.manga_target1 & 7168
In [28]:
value = 1<<10 | 1<<11 | 1<<12
my_filter = 'cube.manga_target1 & {0}'.format(value)
q = Query(search_filter=my_filter)
r = q.run()
print(r)
Marvin Results(query=cube.manga_target1 & 7168, totalcount=4498, count=100, mode=remote)
Let’s search only for galaxies that are Milky Way Analogs
or
Dwarfs
ancillary targets.
In [29]:
targets = ['mwa', 'dwarf']
q = Query(targets=targets)
r = q.run()
print(r)
print('There are {0} galaxies from the Milky Way Analogs and Dwarfs ancillary target catalogs'.format(r.totalcount))
print(r.results)
Marvin Results(query=None, totalcount=26, count=26, mode=remote)
There are 26 galaxies from the Milky Way Analogs and Dwarfs ancillary target catalogs
<ResultSet(set=1.0/1, index=0:26, count_in_set=26, total=26)>
[ResultRow(mangaid='1-121994', plateifu='9485-9102', manga_target3=16384),
ResultRow(mangaid='1-124604', plateifu='8439-6103', manga_target3=8192),
ResultRow(mangaid='1-135491', plateifu='9869-6101', manga_target3=8192),
ResultRow(mangaid='1-146067', plateifu='8937-6102', manga_target3=16384),
ResultRow(mangaid='1-187821', plateifu='8996-6104', manga_target3=16384),
ResultRow(mangaid='1-198869', plateifu='8551-6101', manga_target3=16384),
ResultRow(mangaid='1-199488', plateifu='9038-6103', manga_target3=16384),
ResultRow(mangaid='1-218519', plateifu='8942-12703', manga_target3=16384),
ResultRow(mangaid='1-234408', plateifu='8319-6102', manga_target3=8192),
ResultRow(mangaid='1-23818', plateifu='8611-6104', manga_target3=16384),
ResultRow(mangaid='1-276557', plateifu='8993-6101', manga_target3=16384),
ResultRow(mangaid='1-295506', plateifu='9049-6101', manga_target3=16384),
ResultRow(mangaid='1-321962', plateifu='8552-6101', manga_target3=16384),
ResultRow(mangaid='1-338566', plateifu='8566-6103', manga_target3=16384),
ResultRow(mangaid='1-351538', plateifu='8567-6101', manga_target3=8192),
ResultRow(mangaid='1-384388', plateifu='9494-6102', manga_target3=16384),
ResultRow(mangaid='1-384394', plateifu='9494-6103', manga_target3=16384),
ResultRow(mangaid='1-384930', plateifu='9493-12702', manga_target3=16384),
ResultRow(mangaid='1-385149', plateifu='9502-6102', manga_target3=16384),
ResultRow(mangaid='1-386657', plateifu='8987-6101', manga_target3=16384),
ResultRow(mangaid='1-386800', plateifu='8987-6103', manga_target3=16384),
ResultRow(mangaid='1-415958', plateifu='8985-6101', manga_target3=16384),
ResultRow(mangaid='1-457843', plateifu='8982-12702', manga_target3=16384),
ResultRow(mangaid='43-130', plateifu='9507-6104', manga_target3=16384),
ResultRow(mangaid='43-22', plateifu='8713-6101', manga_target3=16384),
ResultRow(mangaid='43-47', plateifu='8725-9101', manga_target3=16384)]
Searching by Quality Flags¶
The quality accepts all labels from the MANGA_DRPQUAL and MANGA_DAPQUAL maskbit schema. Let’s find all galaxies that suffered from bad flux calibration. This is the flag BADFLUX (bit 8) from the MANGA_DRPQUAL maskbit schema.
In [30]:
quality = ['BADFLUX']
q = Query(quality=quality)
r = q.run()
print(r)
print('There are {0} galaxies with bad flux calibration'.format(r.totalcount))
print(r.results[0:10])
Marvin Results(query=None, totalcount=82, count=82, mode=remote)
There are 82 galaxies with bad flux calibration
<ResultSet(set=1.0/9, index=0:10, count_in_set=10, total=82)>
[ResultRow(mangaid='1-109493', plateifu='8156-9101', quality=1073742144),
ResultRow(mangaid='1-113273', plateifu='7972-1902', quality=1073742144),
ResultRow(mangaid='1-120967', plateifu='8144-12704', quality=1073742144),
ResultRow(mangaid='1-121035', plateifu='8144-1901', quality=1073742144),
ResultRow(mangaid='1-136304', plateifu='8606-1902', quality=1073742144),
ResultRow(mangaid='1-138157', plateifu='8252-9102', quality=1073742144),
ResultRow(mangaid='1-149170', plateifu='8997-3701', quality=1073742144),
ResultRow(mangaid='1-152527', plateifu='8144-6101', quality=1073742144),
ResultRow(mangaid='1-152769', plateifu='8936-3704', quality=1073742144),
ResultRow(mangaid='1-174629', plateifu='8947-3703', quality=1073742080)]
The quality keyword is equivalent to the search parameters
cube.quality
for DRP flags or the file.quality
for DAP flags.
The above query is equivalent to cube.quality & 256
. You can also
perform a NOT bitmask selection using the ~
symbol. To perform a NOT
selection we can only use the cube.quality
parameter. Let’s select
all galaxies that do not have bad flux calibration.
In [31]:
# the above query as a filter condition
q = Query(search_filter='cube.quality & 256')
r = q.run()
print('Objects with bad flux calibration:', r.totalcount)
# objects with bad quality other than bad flux calibration
q = Query(search_filter='cube.quality & ~256')
r = q.run()
print('Bad objects with no bad flux calibration:', r.totalcount)
Objects with bad flux calibration: 82
Bad objects with no bad flux calibration: 604
To find exactly objects with good quality and no bad flags set, use
cube.quality == 0
.
In [32]:
q = Query(search_filter='cube.quality == 0')
r = q.run()
print(r)
print('Objects with good quality:', r.totalcount)
Marvin Results(query=cube.quality == 0, totalcount=4253, count=100, mode=remote)
Objects with good quality: 4253
Useful Resources¶
Check out these pages on the Marvin Docs site for more information querying with Marvin.