Congrats on the launch, @ukuina! Neat idea. Would love to learn more about how you are querying the summaries/comments database and how you are planning to extend the queries in this app.
Shameless self-plug: We are building EvaDB [1], a query engine for shipping fast AI-powered apps with SQL. Here is an illustrative query for analyzing food reviews stored in Postgres and generating responses for negative reviews:
SELECT ChatGPT("Respond to the review with a solution to address the reviewer's concern", review)
FROM postgres_data.review_table
WHERE ChatGPT("Is the review positive or negative? Only reply 'positive' or 'negative'.", review) = "negative"
AND location = "waffle house";
It would be interesting to learn about the queries needed to support the HackYourNews application. Would love to exchange notes on this if you're up for it!
Probably not the point, but shouldn't you be able to sample only from the tokens for "positive" and "negative" (they're both one token!) instead of (or in addition to) putting a request in the context for the model to restrict its responses?
I guess this is the SQL query you have in mind that uses the LIKE operator:
SELECT ChatGPT("Respond to the review with a solution to address the reviewer's concern", review)
FROM postgres_data.review_table
WHERE ChatGPT("Is the review positive or negative?", review) LIKE "%positive%"
AND location = "waffle house";
From a query processing standpoint, both queries should have equivalent performance -- unless we build an index over the output of the ChatGPT query in EvaDB, in which case the former query would be faster than this one.
mm, no, not unless you're doing some LIKE-specific optimization (and even then, I think you'd want "positive%").
So like, at the end of all the decoder layers, the model gives you an output vector; you multiply this by your (un)embedding matrix to get logits, softmax those into token probabilities, then sample from them to choose a token.
Instead of sampling, you could just look at the probabilities for the tokens "positive" and "negative" and return whichever of those two is highest.
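Here's a rough sketch of what I mean, assuming you can get at the raw logits for the next token (hosted APIs may not expose them). The toy vocabulary, token ids, and logit values are all made up for illustration:

```python
import math

def softmax(logits):
    """Turn raw logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_from_logits(logits, label_token_ids):
    """Pick the most probable label among a fixed set of single-token labels.

    Returns the winning label and its probability renormalized over
    the candidate labels only.
    """
    probs = softmax(logits)
    restricted = {label: probs[tid] for label, tid in label_token_ids.items()}
    total = sum(restricted.values())
    best = max(restricted, key=restricted.get)
    return best, restricted[best] / total

# Hypothetical 5-token vocabulary; a real tokenizer has tens of thousands of ids.
VOCAB = ["the", "positive", "negative", "food", "was"]
LABEL_TOKEN_IDS = {"positive": 1, "negative": 2}

# Made-up logits where the unrestricted top token is "the", not a label.
logits = [2.5, 0.3, 1.7, 0.1, -1.0]
label, confidence = classify_from_logits(logits, LABEL_TOKEN_IDS)
print(label, round(confidence, 3))
```

Note that plain greedy decoding would have picked "the" here, so comparing only the two label tokens gets you a valid answer without having to ask for one in the prompt.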
[1] https://github.com/georgia-tech-db/evadb