HAT-P-44 b analysis (new z)

Parameter estimation of transiting exoplanet properties from the light curve

References

In [1]:
!python --version
Python 2.7.13 :: Continuum Analytics, Inc.
In [2]:
import os
os.environ['CONDA_DEFAULT_ENV']
Out[2]:
'moscatel'
(1) File name of the light curves 

  Each file is named by the following format:

  lc[type(f or m)]_[instrument]_[band]_[target name]_[date]_t[target star ID]_c[comparison star IDs]_r[range of aperture radii].bjd.dat

  e.g., 
    lcf_msct_g_hatp12_170124_t1_c23_r28-36.bjd.dat

    meanings:
      type: f 
            (f: flux scale, m: magnitude scale)
      instrument: msct
      band: g
      target name: hatp12
      date: 170124
      target star ID: 1
      comparison star IDs: 2 & 3
      range of aperture radii: 28--36 (pix)

  * ".bjd." means that all time stamps were converted into BJD(TDB).


(2) Format of type "f" (flux scale) light curves 

  column1: Time (BJD_TDB-2450000)
  column2: Airmass
  column3: Mean sky background for all stars (ADU/pix)
  column4: Relative stellar displacement in X direction (pixel)
  column5: Relative stellar displacement in Y direction (pixel)
  column6: Mean FWHM of stellar PSFs (pixel)
  column7: Peak count of the brightest star (ADU)
  column8: Frame name
  column(9+i*2): Relative, normalized (so that the median is equal to unity), and un-detrended flux for the given aperture radius 
  column(9+i*2+1): Theoretical 1-sigma uncertainty of the relative flux for the given aperture radius (including photon noises of the stars, sky background noises, scintillation noises, readout noises, and dark noises) 


(3) Format of type "m" (magnitude scale) light curves

  column1: Time (BJD_TDB-2450000)
  column2: Airmass
  column3: Mean sky background for all stars ( mag = -2.5*log(sky(ADU)) )
  column4: Relative stellar displacement in X direction (pixel)
  column5: Relative stellar displacement in Y direction (pixel)
  column6: Mean FWHM of stellar PSFs (pixel)
  column7: Peak count of the brightest star ( mag = -2.5*log(peak(ADU)) )
  column8: Frame name
  column(9+i*3): raw magnitude of the target star for the given aperture radius, calculated by mag_t = -2.5*log(raw flux(ADU)) 
  column(9+i*3+1): raw magnitude of the ensemble of comparison stars for the given aperture radius, calculated by mag_c = -2.5*log(raw flux(ADU)) 
  column(9+i*3+2): Theoretical 1-sigma uncertainty of the magnitude difference between the target and compariosn stars for the given aperture radius, calculated by mag_t-c_err = sqrt(mag_t_err^2 + mag_c_err^2)

  The "raw magnitude" light curve can be used to partly correct the second-order extinction effect by fiting the coefficient of mag_c (without using approximated airmass values), which gives at least as well as, and in most cases better than, the conventional airmass correction to a relative, already-devided light curve. See Equation (8) of Fukui et al. 2016b:
    http://adsabs.harvard.edu/abs/2016AJ....152..171F

  This method is encouraged to be used as it is more physically motivated than the conventional airmass correction.

import and parse csv

In [3]:
import glob
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
#import multiprocessing
In [4]:
import getpass
from tqdm import tqdm

data_dir = '/home/'+getpass.getuser()+'/data/transit/hatp44_data/lc_hatp44_msct_170215'
file_list=glob.glob(data_dir+'/*.dat')
file_list.sort()

import lc

In [5]:
name='hatp44'
date='170215'
target_star_id='2'
comparison_star_id='1'#13, #3
radii_range='9-14'

DF={}
bands='g,r,z'.split(',')
for b in bands:
    fname='lcf_msct_'+b+'_'+name+'_'+date+'_t'+target_star_id+'_c'+comparison_star_id+'_r'+radii_range+'.bjd.dat'
    df=pd.read_csv(os.path.join(data_dir,fname), delimiter=' ', parse_dates=True)
    df = df.set_index('BJD(TDB)-2450000')
    try:
        df=df.drop('Unnamed: 20',1)
    except:
        pass
    #df.head()
    DF[b]=df
    
DF['r'].head()
Out[5]:
airmass sky(ADU) dx(pix) dy(pix) fwhm(pix) peak(ADU) frame flux(r=9.0) err(r=9.0) flux(r=10.0) err(r=10.0) flux(r=11.0) err(r=11.0) flux(r=12.0) err(r=12.0) flux(r=13.0) err(r=13.0) flux(r=14.0) err(r=14.0)
BJD(TDB)-2450000
7800.099496 1.9221 1782.6 0.341 -0.030 9.81 14002.5 MSCL1_1702150908 1.005844 0.002321 1.005843 0.002334 1.006287 0.002362 1.006163 0.002397 1.006110 0.002440 1.005704 0.002488
7800.099878 1.9173 1775.3 0.382 0.367 10.38 12722.7 MSCL1_1702150909 0.995720 0.002319 0.995471 0.002326 0.995978 0.002349 0.996411 0.002382 0.996416 0.002422 0.997111 0.002470
7800.100283 1.9121 1771.2 -0.132 0.665 9.98 14281.8 MSCL1_1702150910 0.996442 0.002294 0.998204 0.002309 0.998730 0.002334 0.998893 0.002368 0.998776 0.002408 0.998273 0.002453
7800.100688 1.9070 1775.2 0.514 0.060 9.93 13703.6 MSCL1_1702150911 0.995879 0.002290 0.996798 0.002303 0.997262 0.002328 0.998325 0.002364 0.998750 0.002406 0.999398 0.002455
7800.101093 1.9019 1780.5 0.552 -0.179 9.93 13262.4 MSCL1_1702150912 1.003584 0.002317 1.003529 0.002327 1.003996 0.002351 1.004103 0.002385 1.003611 0.002425 1.003570 0.002472

r-band

In [6]:
#fluxes
df=DF['r']
fluxes=df[df.columns[7:][::2]]
fluxes=fluxes[np.abs(fluxes-fluxes.mean())<=(3*fluxes.std())]
#add vertical offset
offset=0.05
for i,col in enumerate(fluxes.columns):
    fluxes[col].apply(lambda x : x-offset*i).plot(figsize=(15,10))
In [7]:
#errors
errors=df[df.columns[8:][::2]]
errors=errors[np.abs(errors-errors.mean())<=(3*errors.std())] 
errors.plot(figsize=(15,5))
Out[7]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f2520d8d7d0>
In [8]:
#centroids
cols='dx(pix),dy(pix)'.split(',')
df[cols].plot(figsize=(15,5),subplots=True)
Out[8]:
array([<matplotlib.axes._subplots.AxesSubplot object at 0x7f25234fe5d0>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x7f2520bf6f90>], dtype=object)

3 bands one star

In [9]:
df.columns
Out[9]:
Index([u'airmass', u'sky(ADU)', u'dx(pix)', u'dy(pix)', u'fwhm(pix)',
       u'peak(ADU)', u'frame', u'flux(r=9.0)', u'err(r=9.0)', u'flux(r=10.0)',
       u'err(r=10.0)', u'flux(r=11.0)', u'err(r=11.0)', u'flux(r=12.0)',
       u'err(r=12.0)', u'flux(r=13.0)', u'err(r=13.0)', u'flux(r=14.0)',
       u'err(r=14.0)'],
      dtype='object')

all bands

In [10]:
#add vertical offset
offset=0.01
col_id=df.columns[7] #flux(r=9.0)
colors='g,r,b'.split(',')
for i,(key,c) in enumerate(zip(sorted(DF.keys()),colors)):
    df=DF[key]
    fluxes=df[df.columns[7:][::2]]
    fluxes=fluxes[np.abs(fluxes-fluxes.mean())<=(3*fluxes.std())]
    fluxes[col_id].apply(lambda x : x-offset*i).plot(color=c,figsize=(15,5),label=key)
plt.legend()
Out[10]:
<matplotlib.legend.Legend at 0x7f2520a68950>
In [11]:
#from astropy.stats import sigma_clipped_stats

idx=6 #flux(r=9.0)
key='z'
df=DF[key]
#move frame column
df=df.drop('frame', axis=1)
#remove outliers
df=df[np.abs(df-df.mean())<=(3*df.std())]
#dropna
df=df.dropna(axis=0)

f=df[df.columns[idx]]
f.plot(figsize=(15,5), marker='o', alpha=0.5)
Out[11]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f251efc87d0>
In [12]:
#clip the right side of baseline
clip=40
#df_z=df_z[:-clip]

t=df.index.values
f=df[df.columns[idx]].values

fig = plt.figure(figsize=(15,5))
plt.plot(t, f, 'o', alpha=0.5)
Out[12]:
[<matplotlib.lines.Line2D at 0x7f251edf2f50>]

Previous results: g, r

k: 0.1373, 0.1412
tc: 7800.2192, 7800.2195
a: 12.3386, 11.5423
i: 1.5545, 1.5426
u1: 1.0015, 0.5617
u2: -0.5596, 0.0278
ls: -6.3270, -6.2819
k0: -0.0017, -0.0006
k1: 0.0007, -0.0000
k2: -0.0005, -0.0001
k3: 3.3006, -3.1566
k4: 0.0000, 0.0000
In [13]:
def scaled_a(p, t14, k, i=np.pi/2, b=0):
    numer = np.sqrt( (k + 1)**2 - b**2 )
    denom = np.sin(i) * np.sin(t14 * np.pi / p)
    return float(numer / denom)

tc = t.mean()
p = 4.3 #fixed
k = 0.1373 #np.sqrt(0.015)
inc = 1.55 #np.pi/2
t14 = 2.8/24
u1 = 1.0 #0.4
u2 = -0.55 #0.4
a = scaled_a(p, t14, k, i=np.pi/2)
# ls = np.log(f.std())
# k0 = 0
# theta = [k,tc,a,i,u1,u2,ls,k0]
theta = [k,tc,a,inc,u1,u2]
print ("initial guess: {}".format(theta))
initial guess: [0.1373, 7800.241880286103, 13.358965960191169, 1.55, 1.0, -0.55]
In [14]:
from pytransit import MandelAgol
MA = MandelAgol()

def model_u(theta, t, p):
    k,tc,a,i,u1,u2 = theta
    m = MA.evaluate(t, k, (u1, u2), tc, p, a, i)
    return m

fig = plt.figure(figsize=(15,5))
plt.plot(t, f, 'ko', t, model_u(theta, t, p), 'r-');

MLE

  • p, period
  • k, r_planet/r_star
  • t14, transit duration
  • i, inclination
  • b, impact paramter
  • tc, mid-point transit time
  • u1, u2, limb-darkening coeffs
  • a_scaled
In [15]:
import scipy.optimize as op

def obj(theta, t, p, f):
    m = model_u(theta, t, p)
    return np.sum((m-f)**2)

print ("cost before: {}".format(obj(theta, t, p, f)))
res = op.minimize(obj, theta, args=(t, p, f), method='nelder-mead')

print ("cost after: {}".format(obj(res.x, t, p, f)))
cost before: 0.0196533383864
cost after: 0.00243099847809
In [16]:
fig = plt.figure(figsize=(15,5))
plt.plot(t, f, 'ko', t, model_u(res.x, t, p), 'r-');
In [17]:
DF[key].columns
Out[17]:
Index([u'airmass', u'sky(ADU)', u'dx(pix)', u'dy(pix)', u'fwhm(pix)',
       u'peak(ADU)', u'frame', u'flux(r=9.0)', u'err(r=9.0)', u'flux(r=10.0)',
       u'err(r=10.0)', u'flux(r=11.0)', u'err(r=11.0)', u'flux(r=12.0)',
       u'err(r=12.0)', u'flux(r=13.0)', u'err(r=13.0)', u'flux(r=14.0)',
       u'err(r=14.0)'],
      dtype='object')
In [18]:
#df_z=DF[key]

#excluded outliers
uncertainty=df['err(r=9.0)'].values#.dropna()
peak_flux=df['peak(ADU)'].values#.dropna()
fwhm = df['fwhm(pix)'].values#.dropna()
xcenter = df['dx(pix)'].values#.dropna()
ycenter = df['dy(pix)'].values#.dropna()
In [19]:
np.c_[fwhm, xcenter, ycenter].shape
Out[19]:
(367, 3)
In [20]:
#systematics parameters
k0, k1, k2, k3, k4 = [0]*5
#log flux uncertainty
ls = np.log(np.nanstd(f))

#parameters vector: 6 free, 7 input
theta = [k,tc,a,inc,u1,u2,ls,k0,k1,k2,k3,k4]

#systematics model (time-dependent)
def model_s(theta, fwhm, uncertainty, xcenter, ycenter, t):
    #functional form of systematics model
    #dummy = np.ones(len(fwhm))
    s = (np.array(theta)*np.c_[fwhm, xcenter, ycenter, uncertainty, t]).sum(axis=1)
    # unpack 6 free params
    #a,b,c,d,e = theta
    #s = a + b*t + c*fwhm + d*xcenter + e*ycenter
    return s

def loglike(theta, t, f, p, fwhm, uncertainty, xcenter, ycenter, ret_mod=False, ret_sys=False, ret_full = False):
    ls = theta[6]
    m = model_u(theta[:6], t, p)
    s = model_s(theta[7:], fwhm, uncertainty, xcenter, ycenter, t) # #add sys model
    
    if ret_mod:
        return m
    if ret_sys:
        return s
    if ret_full:
        return m+s
    
    resid = f - m - s
    
    inv_sig2 = np.exp(-2*ls)
    
    return -0.5*(np.sum((resid)**2 * inv_sig2 + 2*ls))

#negative log-likelihood
nll = lambda *x: -loglike(*x)

print ("NLL before: {}".format(nll(theta, t, f, p, fwhm, uncertainty, xcenter, ycenter)))
result = op.minimize(nll, theta, args=(t, f, p, fwhm, uncertainty, xcenter, ycenter), method='nelder-mead')
print ("NLL after: {}".format(nll(result.x, t, f, p, fwhm, uncertainty, xcenter, ycenter)))
NLL before: -1599.90196743
NLL after: -1987.49186084
In [21]:
f_pred = loglike(result.x, t, f, p, fwhm, uncertainty, xcenter, ycenter, ret_full=True)
plt.plot(t, f, 'ko', t, f_pred, 'r-')
Out[21]:
[<matplotlib.lines.Line2D at 0x7f25076cadd0>,
 <matplotlib.lines.Line2D at 0x7f25076caed0>]

Maximum A Priori Estimation

In [22]:
def logprob(theta, t, f, p, fwhm, uncertainty, xcenter, ycenter, up=None):

    k,tc,a,i,u1,u2,ls,k0,k1,k2,k3,k4 = theta

    if u1 < 0 or u1 > 2 or u2 < -1 or u2 > 1 or k < 0 or k > 1 or i > np.pi/2:
        return -np.inf

    lp = 0
    #u prior
    if up is not None:
        lp += np.log(stats.norm.pdf(u1, loc=up[0], scale=up[1]))
        lp += np.log(stats.norm.pdf(u2, loc=up[2], scale=up[3]))

    ll = loglike(theta, t, f, p, fwhm, uncertainty, xcenter, ycenter)

    if np.isnan(ll).any():
        return -np.inf
    
    #total: sum of prior and likelihood
    return lp + ll

#negative log prob
nlp = lambda *x: -logprob(*x)

print ("NLP before: {}".format(nlp(theta, t, f, p, fwhm, uncertainty, xcenter, ycenter)))
res = op.minimize(nlp, theta, args=(t, p, f, fwhm, uncertainty, xcenter, ycenter), method='nelder-mead')
print ("NLP after: {}".format(nlp(res.x, t, f, p, fwhm, uncertainty, xcenter, ycenter)))
NLP before: -1599.90196743
NLP after: 1413498496.73
In [23]:
plt.plot(t, f, 'ko', t, loglike(res.x, t, f, p, fwhm, xcenter, uncertainty, ycenter, ret_full=True), 'r-')
Out[23]:
[<matplotlib.lines.Line2D at 0x7f250760be90>,
 <matplotlib.lines.Line2D at 0x7f250760bf90>]

MCMC

Note the difference in interpretation of their results.

In [24]:
from emcee import MHSampler, EnsembleSampler
from emcee.utils import sample_ball
from tqdm import tqdm

ndim = len(theta)
nwalkers = 8 *