[WP3] AWBASE and AssociateList
Kor G. Begeman
kgb at astro.rug.nl
Fri Mar 19 12:18:14 CET 2004
Hi Folks,
I just added the method get_attributes_of_singles to the AssociateList class and
updated the Manual. I will now start working on the problem Mark reported yesterday.
Cheers,
Kor
Kor G. Begeman wrote:
> Fedor,
>
> the slow response is mainly due to the Python interface to Oracle.
> Fetching each row separately takes a lot more time than fetching the
> complete column, so two weeks ago, just before my holidays, I implemented
> the function get_attributes_of_pairs. The following example shows the
> difference, starting with your example code:
>
> ++++++++++++++++++++++++++++++++++++++++
> from Experimental.AssociateList import *
> from time import time
>
> # First Fedor's test: access the sources one item at a time
> AL = (AssociateList.ALID == 40)[0]
> t = time()
> ( ( SL1, SID1 ), ( SL2, SID2 ) ) = AL.get_pairs()
> print 'get_pairs took ', time()-t, ' seconds'
> t1 = time()
> for i in range(5):
>     t = time()
>     print i, SID1[i], SL1.sources[SID1[i]]['RA'], time()-t
> print "----------------"
> for i in range(10):
>     t = time()
>     print i, SID1[i], SL1.sources[SID1[i]]['RA'], time()-t
> print 'Total time needed: ', time()-t1, ' seconds'
>
> # Now fetch the wanted attributes for all pairs in one go;
> # the lists in d1 are filled by get_attributes_of_pairs
> d1 = { 'RA': [], 'DEC': [], 'SID': [] }
>
> t = time()
>
> aids = AL.get_attributes_of_pairs( d1 )
> print 'get_attributes_of_pairs took ', time()-t, ' seconds'
>
> t1 = time()
> for i in range(5):
>     t = time()
>     print i, SID1[i], d1['RA'][i], time()-t
> print "----------------"
> for i in range(10):
>     t = time()
>     print i, SID1[i], d1['RA'][i], time()-t
>
> print 'Total time needed: ', time()-t1, ' seconds'
> ----------------------------------------
>
> The result of the above code is the following:
>
> ++++++++++++++++++++++++++++++++++++++++
> get_pairs took 21.3978278637 seconds
> 0 4 85.030808326 18.9332048893
> 1 4 85.030808326 0.000309944152832
> 2 8 84.97373298 5.94629907608
> 3 12 85.0029018978 5.70203304291
> 4 13 84.5810009913 5.91029787064
> ----------------
> 0 4 85.030808326 0.000305891036987
> 1 4 85.030808326 0.00025200843811
> 2 8 84.97373298 0.000249147415161
> 3 12 85.0029018978 0.000250101089478
> 4 13 84.5810009913 0.000272989273071
> 5 14 84.5800842374 5.9364991188
> 6 16 85.0032172429 5.71908092499
> 7 18 84.9222978506 5.75219202042
> 8 23 84.9953589695 6.23765897751
> 9 26 84.5978693438 6.03286409378
> Total time needed: 66.1724839211 seconds
> get_attributes_of_pairs took 32.2162249088 seconds
> 0 4 85.030808326 9.17911529541e-05
> 1 4 85.030808326 6.07967376709e-05
> 2 8 84.97373298 5.48362731934e-05
> 3 12 85.0029018978 5.29289245605e-05
> 4 13 84.5810009913 5.3882598877e-05
> ----------------
> 0 4 85.030808326 5.69820404053e-05
> 1 4 85.030808326 5.31673431396e-05
> 2 8 84.97373298 5.41210174561e-05
> 3 12 85.0029018978 5.29289245605e-05
> 4 13 84.5810009913 5.31673431396e-05
> 5 14 84.5800842374 5.19752502441e-05
> 6 16 85.0032172429 5.29289245605e-05
> 7 18 84.9222978506 5.41210174561e-05
> 8 23 84.9953589695 5.31673431396e-05
> 9 26 84.5978693438 5.19752502441e-05
> Total time needed: 0.00702881813049 seconds
> ----------------------------------------
>
> Your suggestion of retrieving all sourcelist data into memory is not
> advisable since it might eat up all available memory on your machine.
> Just retrieving the data that you really want in one oracle query seems
> to be a much better solution.
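
The difference between fetching row by row and fetching a whole column in one query can be sketched outside AWE. The example below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for Oracle, and the table `sources`, its columns and both helper functions are hypothetical names, not part of AssociateList. An in-memory database has no network round trip, so the timing gap here will be far smaller than the one measured above.

```python
import sqlite3
import time

# Hypothetical stand-in for a source list table; this is NOT the AWE schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sources (sid INTEGER PRIMARY KEY, ra REAL, dec REAL)")
conn.executemany(
    "INSERT INTO sources VALUES (?, ?, ?)",
    [(sid, 84.5 + sid * 1e-4, -1.0 + sid * 1e-4) for sid in range(5000)],
)

# The source IDs we are actually interested in (every fifth source).
sids = list(range(0, 5000, 5))


def fetch_row_by_row(conn, sids):
    """One query per source, which is what item-by-item access amounts to."""
    ra = []
    for sid in sids:
        row = conn.execute("SELECT ra FROM sources WHERE sid = ?", (sid,)).fetchone()
        ra.append(row[0])
    return ra


def fetch_column(conn, sids):
    """One query for the whole column, then select and reorder in Python."""
    wanted = set(sids)
    by_sid = {}
    for sid, ra in conn.execute("SELECT sid, ra FROM sources"):
        if sid in wanted:
            by_sid[sid] = ra
    return [by_sid[sid] for sid in sids]


t = time.time()
slow = fetch_row_by_row(conn, sids)
t_rows = time.time() - t

t = time.time()
fast = fetch_column(conn, sids)
t_column = time.time() - t

print("row by row: %.4f s, one query: %.4f s" % (t_rows, t_column))
```

Both functions return the same list; only the number of queries differs (one per source against a single one).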
>
> At the moment I am working on implementing something similar for singles,
> get_attributes_of_singles.
>
>
> I hope this helps in speeding up things,
>
>
> Kor.
>
>
> Fedor I. Getman wrote:
>
>> Good day Danny,
>>
>> We are trying to match some catalogs and found a problem: when we
>> access the list of resulting pairs produced by
>> AssociateList.get_pairs (the same holds for singles), the response is
>> very slow (approx. 2-4 sec to initialise each list item the first
>> time it is accessed).
>>
>> We use something like:
>>
>> from Experimental.AssociateList import *
>> from time import time
>> AL = (AssociateList.ALID == 4)[0]
>> ( ( SL1, SID1 ), ( SL2, SID2 ) ) = AL.get_pairs()
>> # time the first access to each of the first five sources
>> for i in range(5):
>>     t = time()
>>     print i, SID1[i], SL1.sources[SID1[i]]['RA'], time()-t
>> print "----------------"
>> # second pass: items 0-4 were accessed before, items 5-9 are new
>> for i in range(10):
>>     t = time()
>>     print i, SID1[i], SL1.sources[SID1[i]]['RA'], time()-t
>>
>>
>>
>> and a typical result:
>>
>> 0 245 186.202586063 17.0450201035
>> 1 248 186.025597431 2.86219286919
>> 2 249 186.172369307 3.74087810516
>> 3 250 186.062742686 3.92979979515
>> 4 255 186.188157002 3.66110897064
>> ----------------
>> 0 245 186.202586063 0.000329971313477
>> 1 248 186.025597431 0.000154972076416
>> 2 249 186.172369307 0.000148057937622
>> 3 250 186.062742686 0.000146150588989
>> 4 255 186.188157002 0.000161170959473
>> 5 256 186.300818927 4.09186816216
>> 6 257 185.890934955 3.21303105354
>> 7 259 186.032008037 3.62871098518
>> 8 262 186.183697329 3.29224586487
>> 9 266 185.838021412 3.08625602722
>>
>> It seems that awe does an SQL query to initialise each object the first
>> time it is accessed. Can you or Kor change this and fill the list with
>> data from the DB when the list object is created? Or add a method
>> "fill" or "retrieve"?
>>
>> I hope that one SQL request and parsing its result will be much faster
>> than doing thousands of SQL requests, one for each list item.
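
The behaviour in the timings (slow on first access, fast afterwards) and the requested "fill" method can be sketched as follows. This is a minimal illustration, not the AWE implementation: the class `LazySources`, the loader functions and the in-memory `TABLE` are all hypothetical, and a counter stands in for real SQL round trips.

```python
# Hypothetical in-memory "database"; a real backend would issue SQL here.
TABLE = {sid: {"RA": 186.0 + sid * 1e-3} for sid in range(300)}


def load_one(sid):
    """Stand-in for one query that fetches a single source."""
    return dict(TABLE[sid])


def load_many(sids):
    """Stand-in for one query that fetches many sources at once."""
    return {sid: dict(TABLE[sid]) for sid in sids}


class LazySources:
    """Loads a source on first access and caches it; fill() loads in bulk."""

    def __init__(self, loader_one, loader_many):
        self._load_one = loader_one
        self._load_many = loader_many
        self._cache = {}
        self.queries = 0  # number of simulated round trips

    def __getitem__(self, sid):
        if sid not in self._cache:
            # First access: one round trip for this item only.
            self.queries += 1
            self._cache[sid] = self._load_one(sid)
        return self._cache[sid]

    def fill(self, sids):
        """Fetch all sources that are not cached yet in a single round trip."""
        missing = [sid for sid in sids if sid not in self._cache]
        if missing:
            self.queries += 1
            self._cache.update(self._load_many(missing))


sids = [245, 248, 249, 250, 255, 256, 257, 259, 262, 266]

lazy = LazySources(load_one, load_many)
ra_lazy = [lazy[sid]["RA"] for sid in sids]

filled = LazySources(load_one, load_many)
filled.fill(sids)
ra_filled = [filled[sid]["RA"] for sid in sids]

print("lazy access:", lazy.queries, "queries")
print("after fill():", filled.queries, "query")
```

Passing only the wanted IDs to fill() keeps memory use bounded, which matters when the full source list is large.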
>>
>>
>> On Wed, 17 Mar 2004, Danny R. Boxhoorn wrote:
>>
>>
>>> Hello again Fedor,
>>>
>>> This morning Ewout and I discussed the filenaming convention and
>>> consequently had a look at the use of swarp in RegriddedFrame. We came
>>> to the conclusion that a "clean" solution using symbolic links for the
>>> weight frames was feasible. The solution is "clean" in the sense that
>>> the SwarpConfig in the database contains the configuration that was
>>> used to run swarp. Ewout has implemented this, so it's now in cvs as
>>> AWBASE. The implementation also uses links to the RegriddedFrames (or
>>> ScienceFrames), which means you can coadd about 450 images (I think)
>>> before you hit the next limit.
>>>
>>> Ciao,
>>>
>>> Danny
>>>
>>>
>>> On Tue, Mar 16, 2004 at 11:35:08PM +0100, Fedor Getman wrote:
>>>
>>>> Good evening Danny!
>>>>
>>>> On Tue, 16 Mar 2004, Danny R. Boxhoorn wrote:
>>>>
>>>>
>>>>> Please contact Emmanuel directly that you'd like to see the limits
>>>>> increased (and at least give a decent warning when a limit is
>>>>> exceeded [)
>>>>
>>>>
>>>> Ok.
>>>>
>>>>
>>>>>> Enlarging the string buffer does not completely solve our problem.
>>>>>> As you can see, the correct way is to use not 256 but PATH_MAX
>>>>>> (4095 on Linux) as the possible filename length. Also, in swarp the
>>>>>> maximum number of files, defined by MAXINFIELD, is 8192. So we get
>>>>>> 4095*8192 bytes = 32 MB stored on the stack. But the default stack
>>>>>> size is 8 MB, hence the "Segmentation fault" :(
>>>>>>
>>>>>> (There is another restriction: ARG_MAX for exec() allows only 128 kB
>>>>>> of data to be passed to an external command.)
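
The arithmetic above can be checked with a few lines of Python. The constants are the values quoted in the message, not values read from the swarp sources; the lookup of the local ARG_MAX uses os.sysconf, which is only available on Unix-like systems and may report a different limit than the 128 kB quoted here.

```python
import os

# Values as quoted in the message (assumptions, not read from swarp itself).
PATH_MAX = 4095                  # possible filename length on Linux
MAXINFIELD = 8192                # maximum number of input files in swarp
STACK_LIMIT = 8 * 1024 * 1024    # default stack size, 8 MB
ARG_MAX_QUOTED = 128 * 1024      # argument limit for exec(), 128 kB

# One PATH_MAX-sized buffer per possible input file, all on the stack.
buffer_bytes = PATH_MAX * MAXINFIELD
buffer_mib = buffer_bytes / 2.0 ** 20
overflows_stack = buffer_bytes > STACK_LIMIT

print("filename buffer: %d bytes (%.2f MiB)" % (buffer_bytes, buffer_mib))
print("larger than the 8 MB stack:", overflows_stack)

# Report the limit of the local system, where it can be queried.
if hasattr(os, "sysconf") and "SC_ARG_MAX" in os.sysconf_names:
    print("local ARG_MAX: %d bytes" % os.sysconf("SC_ARG_MAX"))
else:
    print("ARG_MAX cannot be queried here; quoted value: %d bytes" % ARG_MAX_QUOTED)
```

The buffer is roughly four times the default stack size, which is consistent with the segmentation fault reported above.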
>>>>>
>>>>>
>>>>> True and that's a problem I'd like Emmanuel to solve. We simply
>>>>> didn't want to wait for him to do that and went for the quickest
>>>>> solution that worked. You're free to not use swarp until it gets
>>>>> fixed ... `)
>>>>
>>>>
>>>> :)
>>>>
>>>> Yes, as a temporary workaround it works. But for the future I would
>>>> prefer a more robust solution. Maybe Bertin could implement a
>>>> multiline definition for the keyword.
>>>>
>>>>
>>>>
>>>>>> In swarp there is a workaround: weight map files must have the same
>>>>>> basename as the science files, with a different extension (defined
>>>>>> in swarp.conf, default .weight.fits). So after retrieving the weight
>>>>>> map FITS files, we can link (or rename) them to something like
>>>>>> Sci-TIG-WFI-----#854-ccd57-Regr--Sci-53079.4259863.weight.fits
>>>>>> instead of
>>>>>> Cal-TIG-WFI-----#854-ccd57-Regr--Wei-53079.4259863.fits
>>>>>>
>>>>>> Or we can change the filename convention and put the "weight"
>>>>>> keyword in as a pre-suffix.
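
The linking step described above could look roughly like this. It is a sketch only: the helper `link_weight` is a hypothetical name, the science filename is an assumption derived from the link name in the message (the message itself gives only the two weight names), and the example works on empty placeholder files in a temporary directory rather than on real FITS data.

```python
import os
import shutil
import tempfile

WEIGHT_SUFFIX = ".weight.fits"  # swarp default, as quoted in the message


def link_weight(science_path, weight_path, suffix=WEIGHT_SUFFIX):
    """Create a symlink <science basename><suffix> pointing at the weight map."""
    root, _ext = os.path.splitext(science_path)  # strip the trailing ".fits"
    link_path = root + suffix
    if os.path.lexists(link_path):
        os.remove(link_path)  # replace a stale link from an earlier run
    os.symlink(os.path.abspath(weight_path), link_path)
    return link_path


# Demonstration with empty placeholder files in a scratch directory.
workdir = tempfile.mkdtemp()
# Assumed science filename: the link name from the message minus ".weight".
science = os.path.join(
    workdir, "Sci-TIG-WFI-----#854-ccd57-Regr--Sci-53079.4259863.fits")
weight = os.path.join(
    workdir, "Cal-TIG-WFI-----#854-ccd57-Regr--Wei-53079.4259863.fits")
for path in (science, weight):
    open(path, "w").close()

link = link_weight(science, weight)
link_name = os.path.basename(link)
link_target = os.path.basename(os.readlink(link))
print(link_name, "->", link_target)

shutil.rmtree(workdir)  # clean up the scratch directory
```

A symlink leaves the original weight file and its name untouched, so only the link name has to follow the swarp suffix rule.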
>>>>>
>>>>>
>>>>> The link or rename will not work because the SwarpConfig is also
>>>>> stored in the database and it would refer to the "wrong" weight image.
>>>>
>>>>
>>>> With the "suffix" definition of weight map files, we put only the
>>>> suffix in the config, not the list of filenames.
>>>>
>>>>
>>>>> As far as I'm concerned it's fine to change the filename convention
>>>>> and I'll talk to Erik about it tomorrow.
>>>>
>>>>
>>>> Good luck!
>>>>
>>>>
>>>>>> A similar problem could also arise for the FSCALE_DEFAULT list.
>>>>>> Why not put this parameter in the header of the science FITS file?
>>>>>
>>>>>
>>>>> Indeed, now I do not remember what the reason was to avoid doing that.
>>>>> Could you please implement it and commit it if it works?
>>>>
>>>>
>>>> I will do and test this tomorrow.
>>>>
>>>> Ciao,
>>>> Fedor
>>>>
>>>> ----------------------------------------
>>>> Fedor I. Getman
>>>> ----------------------------------------
>>>> INAF (Istituto Nazionale di AstroFisica)
>>>> Osservatorio Astronomico di Capodimonte
>>>> via Moiariello 16, I-80131 Napoli, Italy
>>>> ----------------------------------------
>>>> tel/fax: +39-081-5575445/456710
>>>> e-mail: tig at na.astro.it
>>>> ----------------------------------------
>>>
>>>
>>
>
>
--
Dr. K.G. Begeman
OmegaCEN
Kapteyn Institute
University of Groningen
Postbus 800 NL-9700 AV Groningen
Landleven 12 NL-9747 AD Groningen
The Netherlands
Telephone +31-(0)50-3634059/4073
Telefax +31-(0)50-3636100
e-Mail kgb at astro.rug.nl
WWW http://www.astro.rug.nl/~kgb