Christian
Hi! New here, and loving PonyORM so far! You guys are doing an outstanding job! I'm working with to-many relationships and ran into a pretty tough wall when trying to traverse them.
Christian
Specifically, I want to filter by a collections property, but doing so from the parent object and with the search parameters being variable. I tried to explain it better on SO: http://stackoverflow.com/questions/56312903
Matthew
select(g for g in Game if g.game_type.short == game_type) is an alternative, simpler way to write it
Matthew
so the filters can be just "lambda g:"
Matthew
no tuples
Alexander
Yes, I think you can use attribute lifting, as Matthew suggests. But you need to use in instead of ==, since g.game_type is a collection attribute in your schema (and should be named game_types instead):

g = query.filter(lambda g: game_type in g.game_types.short)

Here g.game_types is a set of Type entities, and g.game_types.short is a multiset of strings
Christian
@matthewrobertbell I tried the same but was stuck with the == comparison. This looks much better already, thanks @metaprogrammer !
Matthew
🙂
Alexander
I wrote answer on StackOverflow with more examples: https://stackoverflow.com/questions/56312903/query-with-optional-arguments-using-ponyorm/56321738#56321738
Christian
Amazing! Thanks for the quick help! Happy customer :)
Lucky
(gotta love legacy)
Hey, at least you can use python. *Shudders in PHP*
Jacob
The legacy software I'm working around is written in Java--very disorganized Java with only hints of separation-of-concern. For my sanity, I'm writing what I need in Python 3.7: OpenAPI + connexion + aiohttp + punq + pony + CQRS (without event sourcing due to the legacy software I have to work in parallel with).
banteg
how do i cast a json field to decimal type? i tried Decimal(log.args['wad']) / Decimal('1e18') which raises a type error. casting to float works fine but i need precision.
Alexander
Hi! What database do you use?
banteg
postgres
banteg
some more samples:
Decimal('1e18') works
Decimal(log.args['wad']) type error
raw_sql("cast(x.args #> '{wad}' as decimal) / 1e18") obviously works
raw_sql("cast(x.args #> '{wad}' as decimal)") / Decimal('1e18') type error on division
Alexander
I think at this moment you need to use a raw_sql fragment. We can add conversion to decimal later #todo
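Until such a conversion exists, another workaround (a stdlib-only sketch with made-up data) is to do the conversion client-side after fetching, outside the query; going through str() handles both string and integer JSON payloads:

```python
import json
from decimal import Decimal

# hypothetical args payload, similar to what a row's JSON column holds
args = json.loads('{"wad": "1200000000000000000"}')

# convert in Python after fetching, rather than inside the query
wad = Decimal(str(args['wad'])) / Decimal('1e18')
print(wad)  # 1.2
```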
Иван
aiohttp? So you use async? How does that cooperate with pony?
From my experience, badly. Because you can't have more than one database connection in your async app, transactions from different places at the same time will be done as one transaction from one connection. And that may lead to behavior like read uncommitted
Иван
Of course, you can use loop.run_in_executor to deal with it. It creates another thread and runs your query with new database connection
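A minimal sketch of that pattern with asyncio; the query function here is just a stand-in, and in a real app it would open its own db_session so Pony gives the worker thread a separate connection:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

def run_blocking_query(limit):
    # stand-in for a blocking Pony query; running it in a worker thread
    # lets it use its own db connection instead of the event loop's one
    return list(range(limit))

async def handler():
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=1) as pool:
        # the event loop stays free while the query runs in the thread
        rows = await loop.run_in_executor(pool, run_blocking_query, 3)
    return rows

print(asyncio.run(handler()))  # [0, 1, 2]
```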
Christian
Just made a recurring donation. Hope it helps a little to keep up the great support and pony development!
Alexander
Thanks! Your donation is greatly appreciated!
Christian
👍
Matthew
def base_filtered_reviews_query(self, category_slug, min_stars, max_stars, min_date=None, max_date=None):
    q = select(r for r in Review
               if exists(m for m in ReviewProductUserMapping
                         if m.product == r.product
                         and m.user == self
                         and m.category_slug == category_slug))
    if (min_stars, max_stars) != (1, 5):
        q = q.filter(lambda r: r.review_rating >= min_stars and r.review_rating <= max_stars)
    if min_date is not None:
        q = q.filter(lambda r: r.review_date >= min_date)
    if max_date is not None:
        q = q.filter(lambda r: r.review_date <= max_date)
    return q

def recent_reviews(self, category_slug, min_stars, max_stars, min_date=None, max_date=None, limit=100, page=None):
    q = self.base_filtered_reviews_query(
        category_slug=category_slug, min_stars=min_stars, max_stars=max_stars,
        min_date=min_date, max_date=max_date,
    ).order_by(Review.review_date.desc(), Review.id.desc())
    if page is None:
        return q.limit(limit)[:]
    else:
        return q.page(pagenum=page, pagesize=limit)[:]
Matthew
That code produces this SQL query:
Matthew
SELECT "r"."id", "r"."created", "r"."review_url", "r"."reviewer_name", "r"."reviewer_url",
       "r"."review_rating", "r"."review_date", "r"."review_title", "r"."review_text",
       "r"."amazon_review_id", "r"."review_verified", "r"."product_attributes",
       "r"."variation_asin", "r"."deleted", "r"."product"
FROM "review" "r"
WHERE EXISTS (
    SELECT 1 FROM "reviewproductusermapping" "m"
    WHERE "m"."product" = "r"."product"
      AND "m"."user" = 1
      AND "m"."category_slug" = ''
)
ORDER BY "r"."review_date" DESC, "r"."id" DESC
LIMIT 1000000
Matthew
(edited to inject parameters directly)
Matthew
Calling recent_reviews in python takes about 40 seconds, running the query directly in psql takes about 1.25 seconds
Matthew
There are about 218k items
Matthew
Any idea why it is much slower with python / pony?
Matthew
I measured the time on the python side with:

import time

print 'reviews 1'
t = time.time()
reviews = current_user.recent_reviews(category_slug=category_slug, min_stars=min_stars,
                                      max_stars=max_stars, min_date=min_date,
                                      max_date=max_date, limit=1000000)
print 'reviews 2', time.time() - t
Alexander
I think it may be the time of constructing 218k objects in memory
Matthew
I was afraid you'd say that 🙂
Alexander
Maybe it is not necessary to construct all those objects at once?
Matthew
It is generating a CSV file which is directly returned to the user
Matthew
so I think it does need to be done at once
Matthew
and most / all of the columns which it selects are used
Matthew
it generates approx 100MB CSV for 218k objects
Matthew
Is there a "lower level" pony mode that I could use to avoid the cost of making the proper pony objects?
Alexander
You can do

sql = query.get_sql()
rows = db.select(sql)
for row in rows:
    print(row[0], row[1], row[2])
    # or, alternatively
    print(row.id, row.created, row.review_url)
Matthew
If the row object has attributes like created or review_url, how is the object different from a normal pony object?
Alexander
It is read-only and its attributes are plain db values (identifiers instead of linked objects)
Matthew
ah thank you, I will try it, probably easier than rewriting the csv generation in rust 🙂
Matthew
AttributeError: 'QueryResult' object has no attribute 'get_sql'
Matthew
q = q.limit(limit)
sql = q.get_sql()
return db.select(sql)
Matthew
Any idea what's wrong?
Alexander
query.filter(...) result type is Query; query.limit(...) result type is QueryResult. It is possible to add get_sql to QueryResult, but right now it is missing #todo

Right now you need to call get_sql before limit, and then append a "LIMIT ..." suffix manually to the SQL. Or you can call the internal method:

sql, _, _, _ = query._construct_sql_and_arguments(limit, offset)
Matthew
thank you very much!
Matthew
Could you please explain how to pass the bindings to db.select(sql) ?
Matthew
return db.select(sql, globals={'user': self.id, 'category_slug': category_slug}) was my guess but it didn't work
Alexander
Sorry, I was wrong. You need to do the following:

# limit & offset are optional
sql, arguments, _, _ = query._construct_sql_and_arguments(limit, offset)
connection = db.get_connection()
cursor = connection.cursor()
cursor.execute(sql, arguments)
rows = cursor.fetchall()

Each row is a tuple. If you use PostgreSQL and want to access columns by name, you can do

import psycopg2.extras
cursor = connection.cursor(
    cursor_factory=psycopg2.extras.NamedTupleCursor
)
Sigmund
I was thinking as well that if speed is the major concern, @matthewrobertbell might be better off using psycopg directly.
banteg
is there a way to specify column as numeric without scale/precision?
banteg
it's kinda counterintuitive that Decimal results in numeric(12,2) which is an arbitrary choice
Alexander
Did you try this https://docs.ponyorm.org/api_reference.html?highlight=sql_type#cmdoption-arg-sql_type ?
banteg
thanks for pointing me in the right direction, Required(Decimal, sql_type='numeric') seems to do the trick
Anonymous
good morning
Anonymous
I'm unable to download a diagram from the editor as a png
Anonymous
Keep getting 404s
Alexander
Thanks for reporting, we'll fix that. In the meantime, you can download svg, it should work
Anonymous
thanks
Matthew
latest_snapshots = models.select(
    (max(s.id), s.date_preset, s.start_date, s.end_date)
    for s in models.FacebookDataSnapshot
    if s.created > datetime.datetime.utcnow() - datetime.timedelta(days=3)
    and s.facebook_ad_account == facebook_ad_account
).order_by(3)[:]
Matthew
That works, but is there a simple way to only get results where there was more than one possible result available, and it chose the one with the largest ID?
Alexander
If I understand correctly, you want to get not one but multiple results, where for each combination of (s.date_preset, s.start_date, s.end_date) there should be at most one result. I'm not sure it can be done more simply because of limitations of SQL syntax
Matthew
Yeah it seems hard to do without multiple queries
Matthew
thanks for looking 🙂
Matthew
(It was a nice to have feature rather than essential for this project)
Alexander
If you later retrieve snapshot objects by id and want to limit the number of queries, then I think it is possible to do in a single query:

later_snapshots = models.select(
    s for s in FacebookDataSnapshot
    for max_id, preset, start, end in previous_query
    if s.id == max_id
)
Alexander
(fixed typo)
Matthew
Is there any downside to making all primary keys 64 bit "bigints", apart from having to specify the primary keys manually and the increased size in the database?
Alexander
Probably no
Matthew
Thanks 🙂
Paŭlo
Hi folks. What is the equivalent in pony to time_diff in mysql?
Paŭlo
I have tried with: select(x for x in foo if (datetime.now() - some_datetime).seconds > 20)
Alexander
Hi Paulo! datetime - datetime is not implemented yet. I think you can use a raw_sql fragment instead:

select(x for x in foo if raw_sql(
    "TIME_TO_SEC(TIMEDIFF(CURRENT_TIMESTAMP(), x.some_datetime))"
) > 20)
Paŭlo
Thank you very much @metaprogrammer Is there any way to do it in Python?
Paŭlo
If I call select(...).filter(some_func), is it executed like a normal func or is it translated to SQL?
Alexander
It is translated to SQL, so you can't just use arbitrary Python