Anonymous

Hi! I have a model with some Optional fields, but I noticed something, and I want to be sure it's Pony related and not the result of something I did. Suppose I have a model with some required fields and also with X = Optional(date) and Y = Optional(str). Then I create an instance of the model, without providing values for X and Y I noticed that in that case X is assigned a None value, but Y is stored as an empty string. Is that how Pony works? Thanks!

Alexander

Yes, this is how Pony and most other ORM works: https://docs.ponyorm.com/api_reference.html#optional-string-attributes It is possible to use two different values to represent missing strings - empty string and null. It may be confusing if both empty string and null are used in the same column, because some queries may return confusing results. So it is better to use just a single way to represent missing string values, and disallow another way for consistency. It is easier to disallow null than to disallow empty string values. Also, empty strings can be indexed, whereas null values are not indexed.

Anonymous

Ahhh ok

Anonymous

It makes sense

Anonymous

Thanks for the clarification!

Matthew

Is JOIN() no longer in the documentation? I just spent 10 minutes looking on the site and googling but I can't find it?

Matthew

I'm sure I'm just overlooking something

Alexey

Is JOIN() no longer in the documentation? I just spent 10 minutes looking on the site and googling but I can't find it?

Don't see it either. Thanks for noticing, will add it to the docs

Matthew

I'm trying to speed up a query like this: select(r for r in R if r.p in u.ps) u.ps is approx 3000 items, so there is a big IN part to the query, JOIN() doesn't seem to help, what can I do?

Matthew

Table is approx 2 million rows, with 3000 ps, it takes 25 - 30 seconds

Matthew

ah with inversing the IN and JOIN it gets a lot faster

Alexey

try this way select(r for r in R if exists(u for u in U if r.p == u.ps))

Alexey

ah, wait, u.ps is a set

Matthew

Yep :)

Matthew

I think it's a database issue fundamentally

Matthew

50% of execution time is being taken by filtering a double precision column for some reason

Matthew

-> Seq Scan on review r (cost=0.00..148774.57 rows=1782971 width=474) Filter: ((review_rating >= 1::double precision) AND (review_rating <= 5::double precision))

Matthew

which means i probably need to adjust indexes

Alexey

ok, I see

Henri

Is it possible to make non persistent changes to a table? E.g. creating an entity instance for the need of the current request without writing it to the database? Something like fortunes = Fortune.select()[:] fortunes.append(Fortune(message='Additional fortune added at request time.')) When the model looks like class Fortune(db.Entity): message = Optional(str)

Alexander

If you want to pass the list to some frontend layer with some additional element, maybe you can convert objects to tuples or dictionaries. Something like that: fortunes = Fortune.select() fortune_list = [f.to_dict(["id", "message"]) for f in fortunes] fortune_list.append({id: additional_id, message: additional_message})

Alexander

I'm trying to speed up a query like this: select(r for r in R if r.p in u.ps) u.ps is approx 3000 items, so there is a big IN part to the query, JOIN() doesn't seem to help, what can I do?

Matthew, you can rewrite that query to convert big IN list to subquery. Something like that: select(r for r in R if exists(p for p in P if p.some_attr == u)) It is hard to write exact query without seeing your models, but something like that should work

Henri

Thanks! That works for me.😊

Andrey

Hi! Maybe I have a strange question, but is it possible for an entity with a composite primary key to have a self-reference where one part is a common table? Like for the following situation: CREATE TABLE regions ( file_id INTEGER NOT NULL REFERENCES files (id), region_id INTEGER NOT NULL, PRIMARY KEY (file_id, region_id), outer_region_id INTEGER, FOREIGN KEY (file_id, outer_region_id) REFERENCES regions (file_id, region_id), total_lines INTEGER );(outer_region must be in the same file)

Andrey

I've tried the following code, but received pony.orm.core.DBSchemaError: Column 'file_id' already exists in table 'regions': class Region(db.Entity): _table_ = "regions" file = Required(File, column="file_id") region_id = Required(int, column="region_id") PrimaryKey(file, region_id) outer_region = Optional("Region", reverse="inner_regions", columns=["file_id", "outer_region_id"]) inner_regions = Set("Region", reverse="outer_region") total_lines = Optional(int)

Matthew

2017-06-24 15:21:52 UTC LOG: duration: 3846.543 ms statement: UPDATE "reviewproduct" SET "first_review_scrape" = false WHERE "id" = 2301 AND "asin" = 'XXX' AND "country_code" = 'US' AND "first_review_scrape" = false How come pony is adding those WHERE clauses, instead of just specifying the ID? It's the only cause I can think of for a simple UPDATE taking nearly 4 seconds?

Matthew

this happens with: product.first_review_scrape = False as the only relevant code

Matthew

if I do: if product.first_review_scrape: product.first_review_scrape = False Then that slow update isn't issued, is it by design for pony to issue an update even when the value stays the same (False in this case)?

Alexander

Hi Matthew! > How come pony is adding those WHERE clauses, instead of just specifying the ID? Pony doing this in order to do optimistic updates. Concurrent transactions should be isolated from each other. There are two ways to achieve such isolation. The first one is to use locks. It may be slow, because if one transaction locked a row in a table, other transactions need to wait until that lock will be released in order to access that row. The second way is to use optimistic updates, this is what Pony tries to do. When current transaction read some attributes of an object, it remember the values of these attributes. If it is necessary to update some attributes of this object, Pony issues an UPDATE query with additional checks that the attributes which were read by application code still contain the same values. If another concurrent transaction already updated some of this attributes, the UPDATE statement will not work and Pony will know that two different transactions are trying to change the same object using a potentially inconsistent way to do it. In other words: if you application code read x.a and x.b and then decided to change x.c, but another concurrent transaction already changed x.a or x.b, then new value of x.c`may be wrong because it does not take into account updated value of `x.a or x.b > It's the only cause I can think of for a simple UPDATE taking nearly 4 seconds? I don't think that an additional check can slow down an UPDATE query. Typically, additional constraints can only speed up the query, because they give to the database an additional ways to optimise that query. Speaking about your query, I'm pretty sure that the database is smart enough to understand that id column is primary key and found row using access by id, and check another column when row is already found without any significant slowdown. The only reason I can think why your UPDATE query may be slow is that some concurrent transaction have locked that row and then wait 4 seconds before commit. Typically if some transaction issue UPDATE query it should commit as fast as possible in order to release locks and make that row available to another transactions. > is it by design for pony to issue an update even when the value stays the same Yes, it is by design. Pony sees that the application code want to have that flag to be set to True after the transaction is completed. Even if that flag was set to True initially, it is possible that some concurrent optimistic transaction already set it to False. So if Pony does not issue UPDATE query, assuming that the flag value is already True, the end result may be wrong.

Matthew

ok, I had guessed it was by design :)

Matthew

the if statement fixes it, so it isn't a serious problem

Alexander

It is still interesting why UPDATE query takes so long time (if it is really the UPDATE query which adds slowdown). Maybe concurrent transactions do something suboptimal

Alexander

Do you use db_session with some additional flags?

Matthew

It should be the only transaction using that row, as it is a background task set by a scheduler, so no double accesses

Matthew

no, just plain db_session

Matthew

I am spending the last evening of my vacation "fire fighting" postgres performance, it is a love / hate relationship :)

Matthew

i am hoping the other problem is just a case of adding a better index

Alexander

I wish you success with that

Matthew

select((x.date, x.a) for x in X) Is there a cheap way to get max one result per unique date, with the lowest "a" value? I'm currently doing a blind select, with no limit, but it is too slow

Alexander

select((x.date, min(x.a)) for x in X)

Matthew

Thank you! I was overcomplicating it

Matthew

the problem was not solved by a new index or clever code, but by a manual VACUUM

Alexander

You mean slow UPDATE query?

Matthew

no, my other problem, that was related to min() query

Matthew

There had been no autovacuum for 12 days, the table has a lot of daily data churn

Alexander

I think it should be possible to set up autovacuum

Matthew

It is, and I have tweaked it before, but it didn't work, so a cronjob makes sure that it is happening regularly

Anonymous

Hi

Anonymous

I'm reading through the docs

Anonymous

Quick question

Anonymous

Is it possible to generate/define entities automatically?]

Matthew

from what source of information?

Anonymous

from the database itself

Anonymous

based on its schema

Alexander

Right now no, but we are working on it. Early prototype should be available before the end of this week.

Anonymous

Thanks!!

Anonymous

Amazing work :)

Juan

I think that this project is amazing! Congratulations for everyone that help and comment errors. Thank you for the project

Alexey

Arturo @juan_castano and everyone here - Thank you for your support!

Lucky

Wooh!

Alexander

Guys, I need to reset my telegram account (I forgot my Telegram cloud password, and it cannot be recovered). Hope my previous messages will not be deleted 😅

Henri

I have a model like class World(db.Entity): randomnumber = Optional(int) When doing id_ = randint(1, 10000) return {'id': id_, 'randomNumber': self[id_].randomNumber} I get the error TypeError: 'World' object does not support indexing Any idea what's going on?

Henri

BTW self == World

Matthew

is it an instance of World, or the class World?

Alexander

Replace self[id_] to self.__class__[id_]

Matthew

and maybe World.select().random(1) would be better?

Alexander

Probably so

Matthew

Have you already created the 10,000 rows?

Henri

Have you already created the 10,000 rows?

Good question! The table seems to be empty. So this will be the issue.

Matthew

What general problem are you trying to solve?

Henri

Add Morepath framework with PonyORM to http://frameworkbenchmarks.readthedocs.io

Matthew

So you need a reliable record of 10,000 ID -> random ints ?

Henri

Yeah there is a SQL file I need to import I think.

Matthew

import random class World(db.Entity): random_number = Optional(int) @classmethod def get_random_number(cls): return cls.select().random()[0] @classmethod def create_random_numbers(cls): for _ in range(10000): cls(random_number=random.randrange(0, 100)) commit()

Matthew

untested

Matthew

assumes you want random numbers to be 0 - 99 in range

Henri

and maybe World.select().random(1) would be better?

You mean random_world = World.select().random(1) return { 'id': random_world.id, 'randomNumber': random_world.randomNumber }

Henri

BTW after importing the tables I get now a really strange error when trying the above (Partly translated from German): pony.orm.core.TransactionIntegrityError: Object World[new:1] cannot be stored in the database. IntegrityError: ERROR: NULL-value in row »id« hurts Not-Null-Constraint DETAIL: failed line contains (null, 0). No idea where this comes from as I'm not inserting anything.

Henri

Replace self[id_] to self.__class__[id_]

This seems to work. Thanks. (Still getting the above error. But also get it when using 'World' directly.

Henri

OK this is fixed now! 😃