[WP3] AWBASE and AssociateList

Kor G. Begeman kgb at astro.rug.nl
Fri Mar 19 08:16:20 CET 2004


Fedor,

the slow response is mainly due to the Python interface to Oracle. Fetching each
row separately takes a lot more time than fetching a complete column in one
query, so two weeks ago, just before my holidays, I implemented the function
get_attributes_of_pairs. The following example shows the difference, starting
with your example code:

++++++++++++++++++++++++++++++++++++++++
from Experimental.AssociateList import *
from time import *

# First Fedor's test
AL = (AssociateList.ALID == 40)[0]
t=time()
( ( SL1, SID1 ), ( SL2, SID2 ) ) = AL.get_pairs()
print 'get_pairs took ', time()-t, ' seconds'
t1=time()
for i in range(5):
      t=time()
      print i, SID1[i], SL1.sources[SID1[i]]['RA'], time()-t
print "----------------"
for i in range(10):
      t=time()
      print i, SID1[i], SL1.sources[SID1[i]]['RA'], time()-t
print 'Total time needed: ', time()-t1, ' seconds'

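# Now the same test with get_attributes_of_pairs: the keys of d1 name the
# attributes wanted, and the call fills the corresponding lists
d1 = { 'RA': [], 'DEC': [], 'SID': [] }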

t=time()

aids = AL.get_attributes_of_pairs( d1 )
print 'get_attributes_of_pairs took ', time()-t, ' seconds'

t1=time()
for i in range(5):
      t=time()
      print i, SID1[i], d1['RA'][i], time()-t
print "----------------"
for i in range(10):
      t=time()
      print i, SID1[i], d1['RA'][i], time()-t

print 'Total time needed: ', time()-t1, ' seconds'
----------------------------------------

The result of the above code is the following:

++++++++++++++++++++++++++++++++++++++++
get_pairs took  21.3978278637  seconds
0 4 85.030808326 18.9332048893
1 4 85.030808326 0.000309944152832
2 8 84.97373298 5.94629907608
3 12 85.0029018978 5.70203304291
4 13 84.5810009913 5.91029787064
----------------
0 4 85.030808326 0.000305891036987
1 4 85.030808326 0.00025200843811
2 8 84.97373298 0.000249147415161
3 12 85.0029018978 0.000250101089478
4 13 84.5810009913 0.000272989273071
5 14 84.5800842374 5.9364991188
6 16 85.0032172429 5.71908092499
7 18 84.9222978506 5.75219202042
8 23 84.9953589695 6.23765897751
9 26 84.5978693438 6.03286409378
Total time needed:  66.1724839211  seconds
get_attributes_of_pairs took  32.2162249088  seconds
0 4 85.030808326 9.17911529541e-05
1 4 85.030808326 6.07967376709e-05
2 8 84.97373298 5.48362731934e-05
3 12 85.0029018978 5.29289245605e-05
4 13 84.5810009913 5.3882598877e-05
----------------
0 4 85.030808326 5.69820404053e-05
1 4 85.030808326 5.31673431396e-05
2 8 84.97373298 5.41210174561e-05
3 12 85.0029018978 5.29289245605e-05
4 13 84.5810009913 5.31673431396e-05
5 14 84.5800842374 5.19752502441e-05
6 16 85.0032172429 5.29289245605e-05
7 18 84.9222978506 5.41210174561e-05
8 23 84.9953589695 5.31673431396e-05
9 26 84.5978693438 5.19752502441e-05
Total time needed:  0.00702881813049  seconds
----------------------------------------
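
As a rough check on these numbers, the short sketch below estimates from how
many sources onward the single bulk fetch is cheaper than per-item access. It
only re-uses the timings printed above. Leaving out the 18.9 s of the very
first access is my assumption (it presumably includes one-off setup cost), and
the variable names are made up for this illustration.

```python
import math

# First-access timings (seconds) copied from the get_pairs output above;
# the very first access (18.9 s) is left out, assuming it includes setup cost.
first_access = [5.94629907608, 5.70203304291, 5.91029787064,
                5.9364991188, 5.71908092499, 5.75219202042,
                6.23765897751, 6.03286409378]

# Time reported above for the single get_attributes_of_pairs call.
bulk_fetch = 32.2162249088

# Average cost of touching one source for the first time.
mean_first_access = sum(first_access) / len(first_access)

# Smallest number of distinct sources for which the bulk fetch is cheaper.
break_even = int(math.ceil(bulk_fetch / mean_first_access))

print("mean first-access time: %.2f s" % mean_first_access)
print("bulk fetch is cheaper from %d sources onward" % break_even)
```

So for anything beyond a handful of sources the bulk fetch wins, and the gap
grows linearly with the length of the list.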

Your suggestion of retrieving all sourcelist data into memory is not advisable,
since it might eat up all available memory on your machine. Retrieving just the
data that you really want in one Oracle query seems to be a much better solution.
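
To illustrate the idea outside of awe, here is a minimal, self-contained sketch
using Python's sqlite3 module in place of Oracle. The table layout, the column
names and the helpers build_demo_db, fetch_row_by_row and fill_attributes are
all hypothetical; this is not how get_attributes_of_pairs is implemented, it
only mimics its calling convention (a dictionary of empty lists that gets
filled) and contrasts one query per row with one query for everything.

```python
import sqlite3


def build_demo_db(n_rows=1000):
    """Create an in-memory table standing in for a source list (hypothetical layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sources (sid INTEGER PRIMARY KEY, ra_deg REAL, dec_deg REAL)"
    )
    conn.executemany(
        "INSERT INTO sources (sid, ra_deg, dec_deg) VALUES (?, ?, ?)",
        [(i, 85.0 + i * 1e-4, -30.0 + i * 1e-4) for i in range(n_rows)],
    )
    conn.commit()
    return conn


def fetch_row_by_row(conn, sids):
    """Slow pattern: one query per source."""
    values = []
    for sid in sids:
        row = conn.execute(
            "SELECT ra_deg FROM sources WHERE sid = ?", (sid,)
        ).fetchone()
        values.append(row[0])
    return values


def fill_attributes(conn, wanted):
    """Fast pattern: fill the lists in `wanted` using a single query."""
    # Map the requested attribute names onto the (hypothetical) column names.
    columns = {"SID": "sid", "RA": "ra_deg", "DEC": "dec_deg"}
    keys = list(wanted)
    select_list = ", ".join(columns[key] for key in keys)
    rows = conn.execute(
        "SELECT %s FROM sources ORDER BY sid" % select_list
    ).fetchall()
    for row in rows:
        for key, value in zip(keys, row):
            wanted[key].append(value)
    return len(rows)


conn = build_demo_db()
d1 = {"RA": [], "DEC": [], "SID": []}
n_filled = fill_attributes(conn, d1)
slow_ra = fetch_row_by_row(conn, d1["SID"][:5])
print("filled %d rows with one query" % n_filled)
```

Only the requested columns are held in memory, which is the point of asking
for specific attributes instead of loading the whole source list.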

At the moment I am working on implementing something similar for singles,
get_attributes_of_singles.


I hope this helps in speeding up things,


Kor.


Fedor I. Getman wrote:
> Good day Danny,
> 
> We are trying to match some catalogs and found a problem:
> when we access the list of resulting pairs produced by
> AssociateList.get_pairs (the same holds for singles),
> we get a very slow response (approx. 2-4 sec to initialise each
> list item the first time it is accessed).
> 
> we use something like:
> 
> from Experimental.AssociateList import *
> from time import *
> AL = (AssociateList.ALID == 4)[0]
> ( ( SL1, SID1 ), ( SL2, SID2 ) ) = AL.get_pairs()
> for i in range(5):
>      t=time()
>      print i, SID1[i], SL1.sources[SID1[i]]['RA'], time()-t
> print "----------------"
> for i in range(10):
>      t=time()
>      print i, SID1[i], SL1.sources[SID1[i]]['RA'], time()-t
> 
> 
> 
> and a typical result:
> 
> 0 245 186.202586063 17.0450201035
> 1 248 186.025597431 2.86219286919
> 2 249 186.172369307 3.74087810516
> 3 250 186.062742686 3.92979979515
> 4 255 186.188157002 3.66110897064
> ----------------
> 0 245 186.202586063 0.000329971313477
> 1 248 186.025597431 0.000154972076416
> 2 249 186.172369307 0.000148057937622
> 3 250 186.062742686 0.000146150588989
> 4 255 186.188157002 0.000161170959473
> 5 256 186.300818927 4.09186816216
> 6 257 185.890934955 3.21303105354
> 7 259 186.032008037 3.62871098518
> 8 262 186.183697329 3.29224586487
> 9 266 185.838021412 3.08625602722
> 
> It seems that awe does an SQL query to initialise each object the first
> time it is accessed. Can you or Kor change this and fill the list with data
> from the DB when the list object is created? Or add a method "fill" or "retrieve".
> 
> I hope that one SQL request and parsing its result will be much faster than
> doing thousands of SQL requests, one for each list item.
> 
> 
> On Wed, 17 Mar 2004, Danny R. Boxhoorn wrote:
> 
> 
>>Hello again Fedor,
>>
>>This morning Ewout and I discussed the filenaming convention and consequently
>>had a look at the use of swarp in RegriddedFrame. We came to the conclusion
>>that a "clean" solution using symbolic links for the weight frames was feasible.
>>The solution is "clean" in the sense that the SwarpConfig in the database
>>contains the configuration that was used to run swarp. Ewout has implemented
>>this, so it's now in cvs as AWBASE. The implementation also uses links to the
>>RegriddedFrames (or ScienceFrames), which means you can coadd about 450 images
>>(I think) before you hit the next limit.
>>
>>Ciao,
>>
>>                                                   Danny
>>
>>
>>On Tue, Mar 16, 2004 at 11:35:08PM +0100, Fedor Getman wrote:
>>
>>>Good evening Danny!
>>>
>>>On Tue, 16 Mar 2004, Danny R. Boxhoorn wrote:
>>>
>>>
>>>>Please contact Emmanuel directly that you'd like to see the limits increased
>>>>(and at least give a decent warning when a limit is exceeded [)
>>>
>>>Ok.
>>>
>>>
>>>>>Enlarging the buffer for strings does not completely solve our problem.
>>>>>As you can see, the correct way is to use not 256 but PATH_MAX
>>>>>(on Linux = 4096) as the possible filename length. Also, in swarp the max
>>>>>number of files, defined by MAXINFIELD, is 8192. So we get 4096*8192 = 32 MB
>>>>>stored on the stack. But the default size allocated for the stack is 8 MB,
>>>>>hence the "Segmentation fault" :(
>>>>>
>>>>>(There is another restriction: ARG_MAX for exec() allows only 128 kB of data
>>>>>to be passed to an external command)
>>>>
>>>>True and that's a problem I'd like Emmanuel to solve. We simply didn't want
>>>>to wait for him to do that and went for the quickest solution that worked.
>>>>You're free to not use swarp until it gets fixed ... `)
>>>
>>>:)
>>>
>>>Yes, as a temporary workaround it works. But for the future I would prefer a
>>>more robust solution. Maybe Bertin will implement a multi-line definition for
>>>the keyword.
>>>
>>>
>>>
>>>>>In swarp there is a workaround: weight map files must have the same basename
>>>>>as the science files, with a different extension (defined in swarp.conf,
>>>>>default .weight.fits). So after retrieving the weight map FITS files, we can
>>>>>link (or rename) them to something like
>>>>>Sci-TIG-WFI-----#854-ccd57-Regr--Sci-53079.4259863.weight.fits
>>>>>instead of
>>>>>Cal-TIG-WFI-----#854-ccd57-Regr--Wei-53079.4259863.fits
>>>>>
>>>>>Or we can change the filename convention and put the "weight" keyword in as
>>>>>a pre-suffix.
>>>>
>>>>The link or rename will not work because the SwarpConfig is also stored
>>>>in the database and it would refer to the "wrong" weight image.
>>>
>>>With the "suffix" definition of weight map files, we put only the
>>>suffix in the config, not a filename list.
>>>
>>>
>>>>As far as I'm concerned it's fine to change the filename convention and
>>>>I'll talk to Erik about it tomorrow.
>>>
>>>Good luck!
>>>
>>>
>>>>>An analogous problem could also arise for the FSCALE_DEFAULT list. Why not
>>>>>put this parameter in the header of the science FITS file?
>>>>
>>>>Indeed, now I do not remember what the reason was to avoid doing that.
>>>>Could you please implement it and commit it if it works?
>>>
>>>I will do this and test it tomorrow.
>>>
>>>Ciao,
>>>	Fedor
>>>
>>>----------------------------------------
>>>            Fedor I. Getman
>>>----------------------------------------
>>>INAF (Istituto Nazionale di AstroFisica)
>>>Osservatorio Astronomico di Capodimonte
>>>via Moiariello 16, I-80131 Napoli, Italy
>>>----------------------------------------
>>>tel/fax:  +39-081-5575445/456710
>>>e-mail:   tig at na.astro.it
>>>----------------------------------------
>>
> 
> ----------------------------------------
>             Fedor I. Getman
> ----------------------------------------
> INAF (Istituto Nazionale di AstroFisica)
> Osservatorio Astronomico di Capodimonte
> via Moiariello 16, I-80131 Napoli, Italy
> ----------------------------------------
> tel/fax:  +39-081-5575445/456710
> e-mail:   tig at na.astro.it
> ----------------------------------------

-- 

Dr. K.G. Begeman
OmegaCEN
Kapteyn Institute
University of Groningen
Postbus 800                         NL-9700 AV Groningen
Landleven 12                        NL-9747 AD Groningen
The Netherlands
Telephone                           +31-(0)50-3634059/4073
Telefax                             +31-(0)50-3636100
e-Mail                              kgb at astro.rug.nl
WWW                                 http://www.astro.rug.nl/~kgb



