Talk:ORES

From Wikitech

Transition to Liftwing

This page describes the transition to LiftWing from streams.wikimedia.org or ores.wikimedia.org. I'm an experienced ORES user (since 2017 I have run a bot that rolls back vandal edits in ruwiki using ORES scores), but I have never used streams.wikimedia.org or ores.wikimedia.org. I have used the standard Wikipedia API to collect oresscores:

https://ru.wikipedia.org/w/api.php?action=query&format=xml&list=recentchanges&rcprop=title%7Ctimestamp%7Coresscores%7Cuser&rclimit=50
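For readers who want to try the same query from a script, here is a minimal Python sketch. It builds the URL above with format=json instead of format=xml and extracts the "damaging" probability from a response. The function names are hypothetical, and the shape of the `oresscores` field (a mapping from model name to "true"/"false" probabilities, or an empty list when an edit has no score) is an assumption to verify against a real response.

```python
import json
from urllib.parse import urlencode

API_URL = "https://ru.wikipedia.org/w/api.php"


def build_recentchanges_url(limit=50):
    """Build the recentchanges query shown above, but asking for JSON."""
    params = {
        "action": "query",
        "format": "json",
        "list": "recentchanges",
        "rcprop": "title|timestamp|oresscores|user",
        "rclimit": limit,
    }
    return API_URL + "?" + urlencode(params)


def damaging_scores(response_text):
    """Yield (title, probability) for every change that has a damaging score.

    Assumed shape: change["oresscores"]["damaging"]["true"] holds the
    probability; changes without scores are skipped.
    """
    data = json.loads(response_text)
    for change in data.get("query", {}).get("recentchanges", []):
        scores = change.get("oresscores") or {}
        damaging = scores.get("damaging")
        if damaging:
            yield change["title"], damaging["true"]
```

The response text can come from any HTTP client; the parsing step is kept separate so it can be checked against a saved response.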

or, more recently, SQL queries against the DB replicas on Toolforge:

select rc_title,
       max(case when oresm_name="damaging" then oresc_probability else 0 end) damaging,
       max(case when oresm_name="goodfaith" then oresc_probability else 0 end) goodfaith,
       actor_name, rc_this_oldid, actor_user
from recentchanges
join ores_classification on oresc_rev=rc_this_oldid
join actor on actor_id=rc_actor
join ores_model on oresc_model=oresm_id
where rc_timestamp>yyyyMMddHHmmss and rc_type=0
group by rc_this_oldid
having max(case when oresm_name="damaging" then oresc_probability else 0 end)>=lowlimit
order by rc_this_oldid desc;

Will these methods work with LiftWing? Will LiftWing scores appear in list=recentchanges API queries on Wikipedia? Will LiftWing DB tables be available on the Toolforge DB replicas, and will it be possible to join those tables with the recentchanges table to get LiftWing scores of recent edits? Ping User:Elukey, User:Ilias Sarantopoulos, User:Klausman. Maxbiohazard (talk) 15:44, 23 August 2023 (UTC)

Hi!
Nothing will change regarding the recent changes queries and requests. The deprecation of ORES just means that the same models will be used to fetch the scores for each edit, but in the background the ORES MediaWiki extension will make requests to Lift Wing instead of ores.wikimedia.org.
The ORES extension is responsible for populating the ores_classification table, which will continue to function.
So, your use case will not be affected at all! Ilias Sarantopoulos (talk) 16:01, 23 August 2023 (UTC)
But LiftWing has its own models (agnostic, multilang...). How will the output of LiftWing models be converted into the ORES "damaging/goodfaith" output? Can I use the new LiftWing models from the rc-ch API and DB replicas? Ping @Ilias Sarantopoulos Maxbiohazard (talk) 17:06, 23 August 2023 (UTC)
All ORES models have been migrated to Lift Wing (see Machine Learning/LiftWing/Inference Services).
This means that exactly the same models, with the same output, are used.
Lift Wing has additional models, which cannot be used by the rc-ch API at the moment and can only be accessed through the API gateway: https://api.wikimedia.org/wiki/Lift_Wing_API/Reference Ilias Sarantopoulos (talk) 17:44, 23 August 2023 (UTC)

Hello again. I tried to use LiftWing and failed. I used this Python code as an example, but my bot is written in C#, so I wrote this code:

liftwing_client = new HttpClient();
liftwing_client.DefaultRequestHeaders.Add("Authorization", "Bearer " + liftwing_token);
liftwing_client.DefaultRequestHeaders.Add("User-Agent", "vandalism detection tool by user MBH");
var result = liftwing_client.PostAsync("https://api.wikimedia.org/service/lw/inference/v1/models/revertrisk-language-agnostic:predict", new FormUrlEncodedContent(new Dictionary<string, string>{ { "lang", "ru"}, { "rev_id", "136742461" } }));

I'm receiving 400 Bad Request in response. Could you explain what I did wrong?

Ping User:Elukey, User:Ilias Sarantopoulos, User:Klausman. Maxbiohazard (talk) 14:42, 18 March 2024 (UTC)

Hi @Maxbiohazard! Does the HTTP 400 reply carry an error message? Elukey (talk) 15:04, 18 March 2024 (UTC)
@Elukey it's difficult to answer, so I'll just show you my Visual Studio output. I also found a potential error in my code and changed the POST line to
var result = liftwing_client.PostAsync("https://api.wikimedia.org/service/lw/inference/v1/models/revertrisk-language-agnostic:predict", JsonContent.Create(new editdata() { lang = "ru", rev_id = "136742461" }));
but it returns 400 too. Also, when I open https://api.wikimedia.org/service/lw/inference/v1/models/revertrisk-language-agnostic:predict in a browser, it shows {"error":"Model with name revertrisk-language-agnostic:predict does not exist."} Maxbiohazard (talk) 16:14, 18 March 2024 (UTC)
@Maxbiohazard the "does not exist" error appears because the browser issues a GET request, while we accept only POSTs (so the GET request leads to this error, which is perhaps misleading).
There should be a way in your code to log the message carried by the 400 response, because we usually add text explaining what went wrong. Please also be careful when sharing Visual Studio output: the bearer token partially leaked in there (I didn't manage to get it fully because zooming is not easy). Elukey (talk) 16:29, 18 March 2024 (UTC)
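To illustrate the GET versus POST point, here is a minimal Python sketch that builds (but does not send) a POST request with a JSON body for the endpoint discussed above. The function name and default User-Agent string are hypothetical. Sending `rev_id` as an integer rather than a string is an assumption, and this thread does not establish that the request body was the cause of the 400.

```python
import json
import urllib.request

LIFTWING_URL = (
    "https://api.wikimedia.org/service/lw/inference/v1/models/"
    "revertrisk-language-agnostic:predict"
)


def build_predict_request(rev_id, lang, token=None,
                          user_agent="example-bot/0.1 (hypothetical)"):
    """Build a POST request carrying a JSON body; nothing is sent here."""
    # Assumption: the service expects JSON with "rev_id" and "lang" keys.
    body = json.dumps({"rev_id": rev_id, "lang": lang}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "User-Agent": user_agent,
    }
    if token:
        # The token goes in unchanged after the "Bearer " prefix.
        headers["Authorization"] = "Bearer " + token
    return urllib.request.Request(
        LIFTWING_URL, data=body, headers=headers, method="POST"
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. A browser address bar cannot reproduce this, because it always issues a GET.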
I know about the token, but it is only part of it. As for the response, let me put it another way: I don't know how to extract or see the error message from my result object; I thought maybe you would know. Maxbiohazard (talk) 16:35, 18 March 2024 (UTC)
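On the general question of reading the text carried by an error response, here is a minimal Python sketch using only the standard library. The function names are hypothetical. In `urllib`, a 4xx or 5xx status raises `HTTPError`, and the response body remains readable from that exception object.

```python
import urllib.error
import urllib.request


def error_details(err):
    """Return (status code, body text) from an HTTPError."""
    return err.code, err.read().decode("utf-8", errors="replace")


def send_and_report(request):
    """Send a request and return (status, body text), even on 4xx/5xx."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # The explanation of what went wrong is in the body, not in the
        # status line, so read and return it instead of discarding it.
        return error_details(err)
```

The same idea applies in other HTTP clients: the status line alone ("400 Bad Request") does not include the explanation, so the response content has to be read explicitly.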
@Elukey this?
{StatusCode: 400, ReasonPhrase: 'Bad Request', Version: 1.1, Content: System.Net.Http.HttpConnectionResponseContent, Headers: { Date: Mon, 18 Mar 2024 17:39:49 GMT Server: envoy Cache-Control: no-cache x-ratelimit-limit: 5000, 5000;w=3600 x-ratelimit-remaining: 4999 x-ratelimit-reset: 1211 Age: 0 X-Cache: cp3073 pass, cp3073 pass x-cache-status: pass Server-Timing: cache;desc="pass", host;desc="cp3073" Strict-Transport-Security: max-age=106384710; includeSubDomains; preload report-to: { "group": "wm_nel", "max_age": 604800, "endpoints": [{ "url": "https://intake-logging.wikimedia.org/v1/events?stream=w3c.reportingapi.network_error&schema_uri=/w3c/reportingapi/network_error/1.0.0" }] } nel: { "report_to": "wm_nel", "max_age": 604800, "failure_fraction": 0.05, "success_fraction": 0.0} x-client-ip: x.x.x.x Content-Length: 43 Content-Type: application/json }}
System.Net.Http.HttpResponseMessage Maxbiohazard (talk) 17:45, 18 March 2024 (UTC)
. Iluvatar (talk) 01:16, 19 March 2024 (UTC)
@Iluvatar {"httpReason":"Jwt header is an invalid JSON","httpCode":401} Maxbiohazard (talk) 09:56, 19 March 2024 (UTC)
liftwing_client.DefaultRequestHeaders.Add("Authorization", "Bearer my.tok.en"); Iluvatar (talk) 11:01, 19 March 2024 (UTC)
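The "Jwt header is an invalid JSON" message suggests checking the token string itself before it goes into the Authorization header. Below is a minimal Python sketch of such a check, assuming the token is a standard JWT: three dot-separated segments, the first being base64url-encoded JSON. The function name is hypothetical, and the thread does not confirm what was actually wrong with the token.

```python
import base64
import json


def jwt_header(token):
    """Decode the header segment of a JWT-shaped token.

    Raises ValueError if the token does not have three segments or if the
    first segment is not base64url-encoded JSON. The signature is NOT
    verified; this only checks the shape of the string.
    """
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected 3 dot-separated segments, got {len(parts)}")
    segment = parts[0]
    # base64url segments in JWTs omit padding; restore it before decoding.
    padded = segment + "=" * (-len(segment) % 4)
    try:
        header = json.loads(base64.urlsafe_b64decode(padded))
    except ValueError as exc:
        raise ValueError("header segment is not base64url-encoded JSON") from exc
    if not isinstance(header, dict):
        raise ValueError("header segment is not a JSON object")
    return header
```

Running this on the raw token (without the "Bearer " prefix) can reveal a truncated token, stray whitespace, or a token that was wrapped in extra quotes when it was copied.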