[WP3] AWBASE and AssociateList
Kor G. Begeman
kgb at astro.rug.nl
Fri Mar 19 08:16:20 CET 2004
Fedor,
the slow response is mainly due to the Python interface to Oracle. Fetching each
row separately takes far longer than fetching a complete column, so two weeks
ago, just before my holidays, I implemented the function
get_attributes_of_pairs. The following example shows the difference, starting
with your example code:
++++++++++++++++++++++++++++++++++++++++
from Experimental.AssociateList import *
from time import *
# First Fedor's test: every new source is fetched from the database on access
AL = (AssociateList.ALID == 40)[0]
t=time()
( ( SL1, SID1 ), ( SL2, SID2 ) ) = AL.get_pairs()
print 'get_pairs took ', time()-t, ' seconds'
t1=time()
for i in range(5):
    t=time()
    print i, SID1[i], SL1.sources[SID1[i]]['RA'], time()-t
print "----------------"
for i in range(10):
    t=time()
    print i, SID1[i], SL1.sources[SID1[i]]['RA'], time()-t
print 'Total time needed: ', time()-t1, ' seconds'
# Now the same loops, with the attributes fetched in one go into d1
d1 = { 'RA': [], 'DEC': [], 'SID': [] }
t=time()
aids = AL.get_attributes_of_pairs( d1 )
print 'get_attributes_of_pairs took ', time()-t, ' seconds'
t1=time()
for i in range(5):
    t=time()
    print i, SID1[i], d1['RA'][i], time()-t
print "----------------"
for i in range(10):
    t=time()
    print i, SID1[i], d1['RA'][i], time()-t
print 'Total time needed: ', time()-t1, ' seconds'
----------------------------------------
The output of the above code is the following:
++++++++++++++++++++++++++++++++++++++++
get_pairs took 21.3978278637 seconds
0 4 85.030808326 18.9332048893
1 4 85.030808326 0.000309944152832
2 8 84.97373298 5.94629907608
3 12 85.0029018978 5.70203304291
4 13 84.5810009913 5.91029787064
----------------
0 4 85.030808326 0.000305891036987
1 4 85.030808326 0.00025200843811
2 8 84.97373298 0.000249147415161
3 12 85.0029018978 0.000250101089478
4 13 84.5810009913 0.000272989273071
5 14 84.5800842374 5.9364991188
6 16 85.0032172429 5.71908092499
7 18 84.9222978506 5.75219202042
8 23 84.9953589695 6.23765897751
9 26 84.5978693438 6.03286409378
Total time needed: 66.1724839211 seconds
get_attributes_of_pairs took 32.2162249088 seconds
0 4 85.030808326 9.17911529541e-05
1 4 85.030808326 6.07967376709e-05
2 8 84.97373298 5.48362731934e-05
3 12 85.0029018978 5.29289245605e-05
4 13 84.5810009913 5.3882598877e-05
----------------
0 4 85.030808326 5.69820404053e-05
1 4 85.030808326 5.31673431396e-05
2 8 84.97373298 5.41210174561e-05
3 12 85.0029018978 5.29289245605e-05
4 13 84.5810009913 5.31673431396e-05
5 14 84.5800842374 5.19752502441e-05
6 16 85.0032172429 5.29289245605e-05
7 18 84.9222978506 5.41210174561e-05
8 23 84.9953589695 5.31673431396e-05
9 26 84.5978693438 5.19752502441e-05
Total time needed: 0.00702881813049 seconds
----------------------------------------
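One way to read the timings above, assuming the source list fetches a row from
the database the first time its SID is accessed and caches it afterwards (an
assumption about the internals, not something confirmed here): every new SID
costs a round trip of several seconds, while a repeated SID, such as SID 4 at
indices 0 and 1, comes back almost instantly. The class below is a hypothetical
stand-in, not the AssociateList implementation; a plain dict plays the role of
the database.

```python
class LazySources(object):
    """Hypothetical stand-in for a source list that loads rows on demand."""

    def __init__(self, backend):
        self.backend = backend   # dict playing the role of the database
        self.cache = {}          # rows already fetched, keyed by SID
        self.queries = 0         # number of simulated database round trips

    def __getitem__(self, sid):
        if sid not in self.cache:
            self.queries += 1    # one round trip per SID not seen before
            self.cache[sid] = self.backend[sid]
        return self.cache[sid]


# RA values taken from the output above.
backend = {
    4: {'RA': 85.030808326},
    8: {'RA': 84.97373298},
    12: {'RA': 85.0029018978},
}
sources = LazySources(backend)
sids = [4, 4, 8, 12]             # SID 4 appears twice, as in the output

first_pass = [sources[sid]['RA'] for sid in sids]
queries_after_first_pass = sources.queries

second_pass = [sources[sid]['RA'] for sid in sids]
queries_after_second_pass = sources.queries
```

In this sketch the first pass needs one round trip per distinct SID and the
second pass needs none, which matches the pattern of slow and fast lines in the
first half of the output.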
Your suggestion of retrieving all sourcelist data into memory is not advisable,
since it might eat up all available memory on your machine. Retrieving only the
data that you really want in one Oracle query seems to be a much better solution.
At the moment I am working on implementing something similar for singles,
get_attributes_of_singles.
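The idea of fetching only the wanted attributes in a single query can be
sketched as follows. This is a minimal illustration, not the code behind
get_attributes_of_pairs: sqlite3 stands in for Oracle, and the table name
`sources`, the `FLUX` column, the coordinate values, and the helper names
`fetch_per_row` and `fetch_columns` are all made up for the example. Only the
calling convention, a dict whose keys name the attributes and whose lists get
filled, follows the example above.

```python
import sqlite3


def fetch_per_row(conn, sids, attribute):
    """Slow pattern: one query per source."""
    values = []
    for sid in sids:
        sql = 'SELECT "%s" FROM sources WHERE SID = ?' % attribute
        values.append(conn.execute(sql, (sid,)).fetchone()[0])
    return values


def fetch_columns(conn, sids, attributes):
    """One query for the requested attributes only; fills the dict's lists."""
    names = sorted(attributes)            # keys name the wanted columns
    unique = sorted(set(sids))            # each source is fetched once
    marks = ', '.join(['?'] * len(unique))
    columns = ', '.join('"%s"' % name for name in names)
    sql = 'SELECT SID, %s FROM sources WHERE SID IN (%s)' % (columns, marks)
    by_sid = dict((row[0], row[1:]) for row in conn.execute(sql, unique))
    for sid in sids:                      # keep the order of the pair list
        for name, value in zip(names, by_sid[sid]):
            attributes[name].append(value)
    return attributes


# Toy database with made-up coordinates; FLUX is never requested below,
# so it is never transferred.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE sources '
             '(SID INTEGER PRIMARY KEY, "RA" REAL, "DEC" REAL, "FLUX" REAL)')
conn.executemany('INSERT INTO sources VALUES (?, ?, ?, ?)',
                 [(sid, 84.5 + 0.01 * sid, -24.0 + 0.01 * sid, 1.0)
                  for sid in range(1, 31)])

sids = [4, 4, 8, 12, 13]
d1 = {'RA': [], 'DEC': [], 'SID': []}
fetch_columns(conn, sids, d1)
```

The column names are pasted into the SQL text, so this sketch assumes they come
from trusted code rather than from user input. Memory use grows only with the
number of requested attributes times the number of pairs, not with the full
width of the source table.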
I hope this helps in speeding up things,
Kor.
Fedor I. Getman wrote:
> Good day Danny,
>
> We are trying to match some catalogs and found a problem:
> when we access the list of resulting pairs produced by
> AssociateList.get_pairs (the same holds for singles),
> the response is very slow (approx. 2-4 sec the first time
> each list item is accessed).
>
> we use something like:
>
> from Experimental.AssociateList import *
> from time import *
> AL = (AssociateList.ALID == 4)[0]
> ( ( SL1, SID1 ), ( SL2, SID2 ) ) = AL.get_pairs()
> for i in range(5):
>     t=time()
>     print i, SID1[i], SL1.sources[SID1[i]]['RA'], time()-t
> print "----------------"
> for i in range(10):
>     t=time()
>     print i, SID1[i], SL1.sources[SID1[i]]['RA'], time()-t
>
>
>
> and a typical result:
>
> 0 245 186.202586063 17.0450201035
> 1 248 186.025597431 2.86219286919
> 2 249 186.172369307 3.74087810516
> 3 250 186.062742686 3.92979979515
> 4 255 186.188157002 3.66110897064
> ----------------
> 0 245 186.202586063 0.000329971313477
> 1 248 186.025597431 0.000154972076416
> 2 249 186.172369307 0.000148057937622
> 3 250 186.062742686 0.000146150588989
> 4 255 186.188157002 0.000161170959473
> 5 256 186.300818927 4.09186816216
> 6 257 185.890934955 3.21303105354
> 7 259 186.032008037 3.62871098518
> 8 262 186.183697329 3.29224586487
> 9 266 185.838021412 3.08625602722
>
> It seems that awe does an SQL query to initialise each object the first
> time it is accessed. Can you or Kor change this and fill the list with data
> from the DB when the list object is created? Or add a method "fill" or
> "retrieve".
>
> I hope that one SQL request and parsing its result will be much faster than
> doing thousands of SQL requests, one for each list item.
>
>
> On Wed, 17 Mar 2004, Danny R. Boxhoorn wrote:
>
>
>>Hello again Fedor,
>>
>>This morning Ewout and I discussed the filenaming convention and consequently
>>had a look at the use of swarp in RegriddedFrame. We came to the conclusion
>>that a "clean" solution using symbolic links for the weight frames was feasible.
>>The solution is "clean" in the sense that the SwarpConfig in the database
>>contains the configuration that was used to run swarp. Ewout has implemented
>>this, so it's now in cvs as AWBASE. The implementation also uses links to the
>>RegriddedFrames (or ScienceFrames), which means you can coadd about 450 images
>>(I think) before you hit the next limit.
>>
>>Ciao,
>>
>> Danny
>>
>>
>>On Tue, Mar 16, 2004 at 11:35:08PM +0100, Fedor Getman wrote:
>>
>>>Good evening Danny!
>>>
>>>On Tue, 16 Mar 2004, Danny R. Boxhoorn wrote:
>>>
>>>
>>>>Please contact Emmanuel directly that you'd like to see the limits increased
>>>>(and at least give a decent warning when a limit is exceeded [)
>>>
>>>Ok.
>>>
>>>
>>>>>Enlarging the buffer for strings does not completely solve our problem.
>>>>>You can see: the correct way is to use not 256 but PATH_MAX
>>>>>(in linux=4095) as the possible filename length. Also, in swarp the max number
>>>>>of files, defined by MAXINFIELD, is 8192. So we get 4095*8192=32MB stored on
>>>>>the stack. But the default size allocated for the stack is 8 MB, hence the
>>>>>"Segmentation fault" :(
>>>>>
>>>>>(There is another restriction: ARG_MAX for exec() allows only 128 kB of data
>>>>>to be passed to an external command)
>>>>
>>>>True and that's a problem I'd like Emmanuel to solve. We simply didn't want
>>>>to wait for him to do that and went for the quickest solution that worked.
>>>>You're free to not use swarp until it gets fixed ... `)
>>>
>>>:)
>>>
>>>Yes, as a temporary workaround it works. But for the future I would prefer a
>>>more robust solution. Maybe Bertin will implement a multiline definition for
>>>keywords.
>>>
>>>
>>>
>>>>>In swarp there is a workaround: weightmap files must have the same basename as
>>>>>the science files, with a different extension (defined in swarp.conf, default
>>>>>.weight.fits). So after retrieving the weightmap fits, we can link (or rename)
>>>>>them to something like
>>>>>Sci-TIG-WFI-----#854-ccd57-Regr--Sci-53079.4259863.weight.fits
>>>>>instead of
>>>>>Cal-TIG-WFI-----#854-ccd57-Regr--Wei-53079.4259863.fits
>>>>>
>>>>>Or we can change the filename convention and put the "weight" keyword as a
>>>>>pre-suffix.
>>>>
>>>>The link or rename will not work because the SwarpConfig is also stored
>>>>in the database and it would refer to the "wrong" weight image.
>>>
>>>With a "suffix" definition of the weightmap files, we put only the suffix
>>>in the config, not a filename list.
>>>
>>>
>>>>As far as I'm concerned it's fine to change the filename convention and
>>>>I'll talk to Erik about it tomorrow.
>>>
>>>Good luck!
>>>
>>>
>>>>>Also, an analogous problem could arise for the FSCALE_DEFAULT list. Why not
>>>>>put this parameter in the header of the science fits?
>>>>
>>>>Indeed, now I do not remember what the reason was to avoid doing that.
>>>>Could you please implement it and commit it if it works?
>>>
>>>I will do and test this tomorrow.
>>>
>>>Ciao,
>>> Fedor
>>>
>>>----------------------------------------
>>> Fedor I. Getman
>>>----------------------------------------
>>>INAF (Istituto Nazionale di AstroFisica)
>>>Osservatorio Astronomico di Capodimonte
>>>via Moiariello 16, I-80131 Napoli, Italy
>>>----------------------------------------
>>>tel/fax: +39-081-5575445/456710
>>>e-mail: tig at na.astro.it
>>>----------------------------------------
>>
>
> ----------------------------------------
> Fedor I. Getman
> ----------------------------------------
> INAF (Istituto Nazionale di AstroFisica)
> Osservatorio Astronomico di Capodimonte
> via Moiariello 16, I-80131 Napoli, Italy
> ----------------------------------------
> tel/fax: +39-081-5575445/456710
> e-mail: tig at na.astro.it
> ----------------------------------------
--
Dr. K.G. Begeman
OmegaCEN
Kapteyn Institute
University of Groningen
Postbus 800 NL-9700 AV Groningen
Landleven 12 NL-9747 AD Groningen
The Netherlands
Telephone +31-(0)50-3634059/4073
Telefax +31-(0)50-3636100
e-Mail kgb at astro.rug.nl
WWW http://www.astro.rug.nl/~kgb