
Version 6 (Shantanu Desai, 07/25/2013 08:03 AM) → Version 7/16 (Shantanu Desai, 07/25/2013 08:04 AM)

h1. Summer final results

*Stage 1: Creating Colored JPGs of Panstarrs Data for Planck's Unconfirmed Clusters*

All information is located in /data1/users/hhead/CLUSTER_CANDIDATE_JPEGS/

This directory contains main_script.sh, which makes the jpegs; automator_script.sh, which takes a list of cluster IDs to mass-produce jpegs; a chart with information on each cluster (e.g. whether a cluster is visible or whether photometric issues are present in the image); three directories named for the classification types (NOT_CLUSTERS, POTENTIAL_CLUSTERS, and UNCERTAIN_CANDIDATES); and a README.txt that explains how to run the scripts.

*Stage 2: Source Extraction Code*

The code for making the SExtractor catalog files can be found at /home/moon/hhead/nextstage.sh

This code goes through the whole process of finding the images of the cluster, unpacking them, running the SExtractor script, and putting these files into a new directory.

*Stage 3: Data Ingestion*

Before the unfortunate crash of the database, a data ingestion step was required before SLR could be run. This was done using /home/moon/hhead/ingestclusterdata.sh.

*Stage 4: SLR (Stellar Locus Regression)*

Part 1:
Originally, SLR was completed using the script at /home/moon/hhead/PHOT_CAL/runslr.sh.
A typo in one of the options meant that many runs had to be redone, but shortly after the reruns began, the database crashed. A new method therefore had to be found to continue the SLR process.

Part 2:
A Python code was eventually found for the purposes of this project. The code is located at /home/moon/hhead/PHOT_CAL/SDSS_SLR/big-macs-calibrate/fit_locus.py. An example of how to run this code can be found in example 3 of the README file in the same directory. In order to run this code, we had to make a columns file and a filters file (available at /home/moon/hhead/PHOT_CAL/SDSS_SLR/big-macs-calibrate/FILTERS/SDSS-?.res), and test the bootstrap option to get the best results.

Example of the command used for running our code: "python fit_locus.py --file catalog_101.fits --columns SDSS.columns --extension 1 --bootstrap 2 -l -r RA -d DEC -j "
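When many catalogs need the same treatment, the command above can be assembled programmatically. The following is only a minimal sketch: the flags are copied verbatim from the example command, while the helper names (build_fit_locus_command, run_fit_locus) and the dry_run switch are hypothetical and not part of fit_locus.py.

```python
import subprocess


def build_fit_locus_command(catalog, columns="SDSS.columns", bootstrap=2):
    """Return the argument list for one fit_locus.py run.

    The flags mirror the example command on this page; nothing else
    about the fit_locus.py interface is assumed here.
    """
    return [
        "python", "fit_locus.py",
        "--file", str(catalog),
        "--columns", columns,
        "--extension", "1",
        "--bootstrap", str(bootstrap),
        "-l", "-r", "RA", "-d", "DEC", "-j",
    ]


def run_fit_locus(catalogs, workdir=".", dry_run=True):
    """Build the command for each catalog and run it unless dry_run is set.

    workdir is assumed to be the big-macs-calibrate directory.
    """
    commands = [build_fit_locus_command(c) for c in catalogs]
    if not dry_run:
        for cmd in commands:
            # check=True stops at the first failing catalog.
            subprocess.run(cmd, cwd=workdir, check=True)
    return commands


if __name__ == "__main__":
    # Dry run: only print what would be executed.
    for cmd in run_fit_locus(["catalog_101.fits"]):
        print(" ".join(cmd))
```

Keeping dry_run=True by default makes it easy to inspect the commands before anything is executed.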

Before this code can be run, though, the script /home/moon/hhead/PHOT_CAL/SDSS_SLR/big-macs-calibrate/SDSS_makingplotinfo.sh must be run on the cluster. It takes the SExtractor catalogs and rearranges them into the column order and file format that the code expects. After the SLR code succeeds, the script createinputforredsequencer.sh in the same directory must also be run for the next step; it takes the SLR output and rearranges it into the order and format that the red-sequencing code uses.

*Stage 5: Red-sequencer code*

This code was developed by Christina Hennig and Jiayi Liu and is a two-part process. The first part is checkphotoz, located at /home/moon/hhead/package/. It takes the R500, BCG_RA, BCG_DEC, and an output file name, and makes a rough estimate of the redshift for a cluster. Next, the Python code plotZ.py (located in the same directory) takes a file name, produces a graph from which to determine the best likelihood color combination, and prints to the screen the uncertainties on the peak likelihood in each color combination. In total for the unconfirmed clusters, the process goes as follows:

First, training is necessary to know what general redshift range the cluster will be in. This involves viewing jpegs of clusters over a range of redshift values and learning how clusters in each redshift range typically look. For such training, jpegs are available in /home/moon/hhead/ORIGINAL_JPEGS/ and the Preliminary Results page of this wiki gives known redshift values for clusters.
Once the user has a general knowledge of how these clusters look, the next step is to view the jpeg created for the cluster candidate and make an educated guess as to the redshift range.
Then, the user can run the checkphotoz code and the plotZ.py code.
Example: "./checkphotoz 746.galaxies 3.957633 97.75285243587 -14.83486888914 pz746"
"python plotZ.py --file pz746"
Once these are complete, the output to the screen should show the uncertainties. If a color combination shows '2.0' or '0.0', an error has occurred and must be investigated. Otherwise, these values are the uncertainties for each peak. From here, the next step is to display the plot created.
Example: "display pz746.png"
This plot shows the Gaussian peaks of likelihood vs. redshift. If the redshift is known or estimated to be within the range 0.0 to 0.3, the color combinations to use are g-r and g-i, regardless of whether the peaks in r-i, r-z, or i-z are higher. If the redshift is 0.31 to 0.5, use r-i and r-z, ignoring the height of the peaks in the other color combinations. The color combination i-z is to be used for very high redshifts. Within each range, use whichever of the color combinations available for that range has the highest peak.
Example: redshift is estimated to be 0.23. One would ignore the color combination peaks of r-i, r-z, and i-z, even if r-z showed a very high peak. Between g-r and g-i, g-i shows the higher peak, so that's the color combination to go with.
With a color combination determined, the user can then view the data file that corresponds to it. The files are named pz<cluster>_<number>_bg.dat
Example: "vi pz746_1_bg.dat"
The numbers 0 through 4 designate the files, starting with g-r = 0 and ending with i-z = 4.
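The selection rule and file naming described above can be sketched in a few lines of Python. This is an illustration only, with several assumptions: the page does not say where "very high" redshift begins, so anything above 0.5 is treated as i-z; values between 0.3 and 0.31 are assigned to the r-i/r-z range; the intermediate file numbers (g-i = 1, r-i = 2, r-z = 3) are inferred from the order in which the color combinations are listed; and the peak heights are assumed to be read off the plot by the user. The function names are hypothetical.

```python
# Assumed numbering: only g-r = 0 and i-z = 4 are stated on this page;
# the values in between follow the order the combinations are listed in.
COLOR_INDEX = {"g-r": 0, "g-i": 1, "r-i": 2, "r-z": 3, "i-z": 4}


def allowed_colors(z_est):
    """Return the color combinations to consider for an estimated redshift."""
    if z_est < 0:
        raise ValueError("redshift estimate must be non-negative")
    if z_est <= 0.3:
        return ["g-r", "g-i"]
    if z_est <= 0.5:  # assumption: the 0.3 to 0.31 gap goes to this range
        return ["r-i", "r-z"]
    return ["i-z"]  # assumption: "very high" means above 0.5


def pick_color(z_est, peak_heights):
    """Pick the allowed color combination with the highest peak.

    peak_heights maps each color combination to the peak height the
    user read off the plot; combinations outside the allowed range are
    ignored even if their peaks are higher.
    """
    return max(allowed_colors(z_est), key=lambda color: peak_heights[color])


def data_file_name(prefix, color):
    """Build the data file name, e.g. pz746_1_bg.dat for prefix 'pz746'."""
    return f"{prefix}_{COLOR_INDEX[color]}_bg.dat"


if __name__ == "__main__":
    # Hypothetical peak heights for illustration only.
    peaks = {"g-r": 0.4, "g-i": 0.7, "r-i": 0.2, "r-z": 0.95, "i-z": 0.1}
    color = pick_color(0.23, peaks)
    print(color, data_file_name("pz746", color))
```

With an estimated redshift of 0.23, the r-z peak is ignored despite being the highest overall, matching the worked example above.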

In this file, the user should find where the highest value occurs in the second column; the redshift for the cluster is the value in the first column of that row. The uncertainty on this redshift is the error for that color combination that was printed to the terminal when the plotZ.py script was run.
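Reading the peak out of the data file by eye is error-prone for long files, so the lookup can also be scripted. The sketch below assumes the file is plain text with whitespace-separated numeric columns (redshift first, likelihood second) and that any comment lines start with '#'; the actual file layout should be checked first. The function names are hypothetical, and the '2.0'/'0.0' check simply encodes the failure values mentioned above.

```python
def peak_from_lines(lines):
    """Return (redshift, likelihood) for the row with the highest likelihood.

    Each line is expected to hold at least two whitespace-separated
    numbers: redshift in the first column, likelihood in the second.
    Blank lines and lines starting with '#' are skipped.
    """
    best = None
    for line in lines:
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        redshift, likelihood = float(fields[0]), float(fields[1])
        if best is None or likelihood > best[1]:
            best = (redshift, likelihood)
    if best is None:
        raise ValueError("no data rows found")
    return best


def peak_redshift(path):
    """Apply peak_from_lines to a file such as pz746_1_bg.dat."""
    with open(path) as handle:
        return peak_from_lines(handle)


def uncertainty_failed(value):
    """True if the printed uncertainty is one of the known failure values."""
    return float(value) in (0.0, 2.0)


if __name__ == "__main__":
    # Made-up rows for illustration; real files come from checkphotoz.
    rows = ["0.21 0.10", "0.22 0.55", "0.23 0.90", "0.24 0.40"]
    print(peak_from_lines(rows))
```

If uncertainty_failed returns True for the chosen color combination, the redshift read from the file should not be trusted.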

Troubleshooting:

As can be expected, real science data isn't going to be perfect! Should the python script give '2.0' or '0.0' for the errors, or the graph simply be a mess with badly placed Gaussians, there could be an issue with the data itself. The way to check this is with region file sanity checks: view the region of the cluster with a region file of the galaxies present and see what could be the cause. If too few galaxies exist within the region of the supposed cluster, if another cluster lies within the radius of the cluster candidate, or if it's a crowded field with lots of stars or bright objects overlapping or near the cluster itself, this code will not give good results. In such a case, other methods might need to be explored. While this check does not offer a solution to the problem, it does offer an explanation for why this code may not work.
The code to create region files can be found at /home/moon/hhead/createregionfile.sh. These region files should be copied into their respective UNCONFIRMED_CLUSTERS directories, and then /data1/users/hhead/DATA_FOR_CLUSTERS/UNCONFIRMED_CLUSTERS/imagemaker.sh should be used to create a color image with the region of the cluster and the extracted galaxies marked. From here, the image can be checked for the issues mentioned previously.

*Stage 6: Future Work*

From here, methods need to be found by which the redshift values for these unconfirmed clusters can be determined, if this is possible.
