Flutter: How to extract the user account name and video id from a shortened TikTok URL?

I want to extract the real TikTok video link, but my code doesn't seem to be working. I want to get
https://www.tiktok.com/@lilymaycreative/video/6911015584570395906?sender_device=pc&sender_web_id=6894321561748211206&is_from_webapp=v
from the shortened link, which is
https://vm.tiktok.com/ZSTjLwCK/
var dio = Dio(BaseOptions(connectTimeout: 10000, receiveTimeout: 10000, headers: {
  'User-Agent':
      'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'
}));
try {
  Response response = await dio.get('https://vm.tiktok.com/ZSTjLwCK/');
  // Parse the returned HTML so the video link can be looked up in it.
  _document = parse(response.data);
  if (_document != null) print(_document);
} catch (e) {
  print(e);
}

Here is an approach that may help solve your problem. If you open the developer tools (F12) on the page that the shortened link resolves to, you can find a video tag in the HTML, and its src attribute gives you access to the video link.


Companion Uppy always fails with > 5GB uploads when getting the /complete request

companion: 2022-09-22T23:31:07.088Z [error] a62da431-f9ce-4fae-b18d-dc59189a53ea root.error PayloadTooLargeError: request entity too large
at readStream (/usr/local/share/.config/yarn/global/node_modules/raw-body/index.js:155:17)
at getRawBody (/usr/local/share/.config/yarn/global/node_modules/raw-body/index.js:108:12)
at read (/usr/local/share/.config/yarn/global/node_modules/body-parser/lib/read.js:77:3)
at jsonParser (/usr/local/share/.config/yarn/global/node_modules/body-parser/lib/types/json.js:135:5)
at Layer.handle [as handle_request] (/usr/local/share/.config/yarn/global/node_modules/express/lib/router/layer.js:95:5)
at trim_prefix (/usr/local/share/.config/yarn/global/node_modules/express/lib/router/index.js:317:13)
at /usr/local/share/.config/yarn/global/node_modules/express/lib/router/index.js:284:7
at Function.process_params (/usr/local/share/.config/yarn/global/node_modules/express/lib/router/index.js:335:12)
at next (/usr/local/share/.config/yarn/global/node_modules/express/lib/router/index.js:275:10)
at middleware (/usr/local/share/.config/yarn/global/node_modules/express-prom-bundle/src/index.js:174:5)
::ffff:172.29.0.1 - - [22/Sep/2022:23:31:07 +0000] "POST /s3/multipart/FqHx7wOxKS8ASbAWYK7ZtEfpWFOT2h9KIX2uHTPm2EZ.k1INl8vxfdpH7KBXhLTii1WL7GeDLzLcAKOW0vmxKhfCrcUCRMgHGdxEd5Nwxr._omBrtqOQFuY.Fl9nX.Vy/complete?key=videos%2Fuploads%2Ff86367432cef879b-4d84eb44-thewoods_weddingfilm_1.mp4 HTTP/1.1" 413 211 "http://localhost:8080/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36"
It always gives a 413 on the last request, which carries about 188 KB of payload describing all the parts.
I've tried everything, including:
var bodyParser = require("body-parser");
app.use(bodyParser.json({ limit: "50mb" }));
app.use(bodyParser.urlencoded({ limit: "50mb", extended: false }));
but it has no effect. I've spent months on this problem and read every article, complaint, and issue on the internet about it, and still no one seems to know why it happens or how to resolve it.
Can anyone help?
This is a known issue with the S3 plugin. It is fixed in the latest version of Uppy, but Companion is still on an older version. You can use the S3 Multipart plugin directly, which is what Companion uses under the hood.
const Uppy = require('@uppy/core')
const AwsS3Multipart = require('@uppy/aws-s3-multipart')

const uppy = new Uppy()
uppy.use(AwsS3Multipart, {
  companionUrl: 'https://companion.uppy.io',
  companionCookiesRule: 'same-origin',
  limit: 5,
  getUploadParameters (file) {
    return {
      method: 'post',
      endpoint: 'https://companion.uppy.io/s3/multipart',
      fields: {
        filename: file.name,
        size: file.size,
        contentType: file.type
      }
    }
  }
})

uppy.on('upload-success', (file, data) => {
  console.log('Upload successful', file, data)
})
uppy.on('upload-error', (file, error) => {
  console.log('Upload error', file, error)
})

uppy.addFile({
  name: 'test.txt',
  type: 'text/plain',
  data: new Blob(['hello world'], { type: 'text/plain' })
})
uppy.upload()
The S3 Multipart plugin is a bit more complicated to use than the S3 plugin, but it is more flexible: it lets you upload files larger than 5GB, upload parts in parallel, and target S3-compatible services like Minio. The plain S3 plugin is simpler to use, but it has none of those capabilities.

Puppeteer doesn't load the page when headless is false

I have a script that parses a specific webpage.
When I set headless to false, Puppeteer doesn't load the page:
await page.goto('https://www.google.com', {
  waitUntil: 'load',
  // Remove the timeout
  timeout: 0
});
I've tried a lot of configurations, like:
const args = [
  '--no-sandbox',
  '--enable-logging',
  '--v=1',
  '--disable-gpu',
  '--disable-extensions',
  '--disable-setuid-sandbox',
  '--disable-infobars',
  '--window-position=0,0',
  '--ignore-certificate-errors',
  '--ignore-certificate-errors-spki-list',
  '--user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3312.0 Safari/537.36"'
];
const options = {
  args,
  headless: false, // default is true
  userDataDir: "./user_data",
  defaultViewport: null,
  devtools: true,
  ignoreHTTPSErrors: true,
};
But the script hangs at await page.goto until it times out.
As I understand it, you're not actually parsing google.com.
The first thing to consider is waitUntil: 'load'. What it does is consider navigation to be finished when the load event is fired.
The load event fires when the whole page has loaded, including all dependent resources such as JavaScript files, CSS files, and images.
There is a good chance that this event is not firing within a reasonable timeout in your case, so I would suggest not relying on this waitUntil but using another wait, like the presence of some selector, for example:
await page.goto('https://www.google.com');
await page.waitForSelector('[name="q"]');
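If there is no convenient selector to wait for on the page you actually parse, another option (a minimal sketch, not part of the original answer) is to pass a less strict waitUntil value so goto() resolves before every subresource has finished loading:

// Sketch: resolve goto() without waiting for the full 'load' event.
// 'domcontentloaded' fires once the HTML has been parsed;
// 'networkidle2' waits until there are at most 2 in-flight requests for 500 ms.
await page.goto('https://www.google.com', {
  waitUntil: 'domcontentloaded', // or 'networkidle2'
  timeout: 60000,                // keep a finite timeout instead of 0
});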

Next.js dynamic API pages fail to respond to POST requests with a Content-Type: application/json header

I've got a Next.js React app running on a custom Express server with custom routes. I'm working on this project by myself, but I'm hoping I might have a collaborator at some point, so my main goal is really just to clean things up and make everything more legible.
As such, I've been trying to move as much of the Express routing logic as possible to the built-in Next.js API routes. I'm also trying to replace all my fetch calls with axios requests, since they look less verbose.
// current code
const data = await fetch("/api/endpoint", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ foo: "bar" })
}).then(x => x.json());

// what I'd like
const data = await axios.post("/api/endpoint", { foo: "bar" });
The problem I've been having is that the dynamic Next.js API routes stall as soon as there's JSON data in the body. I'm not even getting an error; the request just gets stuck as "pending" and the awaited promise never resolves.
I get responses from these calls, but I can't pass in the data I need:
// obviously no data passed
const data = await axios.post("/api/endpoint");

// req.body = {"{ foo: 'bar' }":""}, which is weird
const data = await axios.post("/api/endpoint", JSON.stringify({ foo: "bar" }));

// req.body = "{ foo: 'bar' }" if headers omitted from fetch, so I could just JSON.parse here,
// but I'm trying to get away from fetch and possible parse errors
const data = await fetch("/api/endpoint", {
  method: "POST",
  // headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ foo: "bar" })
}).then(x => x.json());
If I try to call axios.post("api/auth/token", {token: "foo"}), the request just gets stuck as pending and is never resolved.
The Chrome Network panel gives me the following info for the stalled request:
General
Request URL: http://localhost:3000/api/auth/token
Referrer Policy: no-referrer-when-downgrade
Request Headers
Accept: application/json, text/plain, */*
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9,es;q=0.8
Connection: keep-alive
Content-Length: 26
Content-Type: application/json;charset=UTF-8
Cookie: token=xxxxxxxxxxxxxxxxxxxx; session=xxxxxxxxxxxxxxxxxxxxxxx
Host: localhost:3000
Origin: http://localhost:3000
Referer: http://localhost:3000/dumbtest
Sec-Fetch-Dest: empty
Sec-Fetch-Mode: cors
Sec-Fetch-Site: same-origin
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36
Request Payload
{token: "foo"}
I've tried looking into what might be causing this, and everything seems to point towards an issue with preflight requests, but since those are related to CORS policies, I don't understand why I'd be encountering them: I'm making a request from http://localhost:3000 to http://localhost:3000/api/auth/token.
Even so, I did try to add the cors middleware as shown in the Next.js example, but that didn't make a difference. As far as I can tell, the request never even hits the server: I've got a console.log call as the first line in the handler, but it's never triggered by these requests.
Is there something obvious I'm missing? This feels like it should be a simple switch to make, but I've spent the last day and a half trying to figure it out, and I keep reaching the same point with every solution I try: staring at a gray pending request in my Network tab and a console reporting no errors at all.
After a few more hours of searching, I found my answer here.
It turns out that since I was using bodyParser middleware in my Express server, I had to disable Next.js's built-in body parsing by adding this at the top of my API route file:
export const config = {
  api: {
    bodyParser: false,
  },
}
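For context, here is roughly what the whole API route file could look like with that override in place (a minimal sketch; the handler body and the token field are hypothetical, and req.body is assumed to already be populated by the Express body-parser middleware in the custom server):

// pages/api/auth/token.js (sketch)
export const config = {
  api: {
    bodyParser: false, // let the Express body-parser in the custom server handle parsing
  },
}

export default function handler(req, res) {
  // req.body was parsed upstream by the Express middleware (assumption).
  const { token } = req.body
  res.status(200).json({ received: token })
}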

Mapping a font URL to a font name

How do I get the font name based on the font URL using Puppeteer?
I am using Network.requestIntercepted to get the list of fonts being used on a given website. However, the response does not contain any information about the font family that is used in the CSS.
Is there a way to get the font-family name along with the corresponding font URL that is being used on the page?
await client.on('Network.requestIntercepted', async e => {
  if (e.resourceType == "Font") {
    console.log(e)
    fontCollection.add(e.request.url)
  }
})
While the response contains the font request details, it does not contain the font-family name:
{ interceptionId: 'interception-job-14.0',
request:
{ url:
'https://fonts.gstatic.com/s/lato/v15/S6uyw4BMUTPHjx4wWyWtFCc.ttf',
method: 'GET',
headers:
{ Origin: 'https://goldrate.com',
'User-Agent':
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/73.0.3679.0 Safari/537.36',
Accept: '*/*',
Referer:
'https://fonts.googleapis.com/css?family=Lato:100,100i,300,300i,400,400i,700,700i,900,900i' },
initialPriority: 'VeryHigh',
referrerPolicy: 'no-referrer-when-downgrade' },
frameId: '4127ABB5A3E704843D0AB4756C7507E4',
resourceType: 'Font',
isNavigationRequest: false }
You have two options:
Guess the font from URL and/or HTTP headers
Download the font file and inspect it
Option 1: Guess the font from URL and HTTP headers
By looking at the request information, you can see the font name in two places: first in the URL and second in the Referer header:
URL
fonts.gstatic.com/s/lato/v15/S6uyw4BMUTPHjx4wWyWtFCc.ttf
Referer:
fonts.googleapis.com/css?family=Lato:100,100i,300,300i,400,400i,700,700i,900,900i
From that information you can therefore find out which font is being used.
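As a rough sketch of option 1 (assuming the requests follow the Google Fonts patterns shown above; guessFontFamily is just an illustrative helper name), the family could be pulled out of the Referer query string or the URL path:

// Sketch: guess the family name of a Google Fonts request.
// Assumes the fonts.gstatic.com/s/<family>/... and
// fonts.googleapis.com/css?family=<Family>:... patterns shown above.
function guessFontFamily(request) {
  const referer = request.headers.Referer || '';
  const refMatch = referer.match(/[?&]family=([^:&]+)/);
  if (refMatch) return decodeURIComponent(refMatch[1]).replace(/\+/g, ' ');

  // Fall back to the path segment after /s/ in the gstatic URL (lowercased there).
  const urlMatch = request.url.match(/\/s\/([^/]+)\//);
  return urlMatch ? urlMatch[1] : null;
}

console.log(guessFontFamily({
  url: 'https://fonts.gstatic.com/s/lato/v15/S6uyw4BMUTPHjx4wWyWtFCc.ttf',
  headers: { Referer: 'https://fonts.googleapis.com/css?family=Lato:100,100i,300,300i,400,400i,700,700i,900,900i' },
})); // "Lato"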
Option 2: Download the font file and inspect it
If the first option is not reliable enough (maybe you want to crawl other pages, too?), you can always download the file with a tool like node-fetch when intercepting the request, and then parse the font file's metadata.
The library fontkit can parse a TTF file and read metadata like familyName or fullName:
Code sample
const fetch = require('node-fetch');
const fontkit = require('fontkit');

(async () => {
  const response = await fetch('https://fonts.gstatic.com/s/lato/v15/S6uyw4BMUTPHjx4wWyWtFCc.ttf');
  const buffer = await response.buffer();
  const font = fontkit.create(buffer);
  console.log(font.familyName); // "Lato"
  console.log(font.fullName); // "Lato Regular"
})();
You could then do this inside your Network.requestIntercepted block to find out which font is being used.
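Putting it together, the interception handler could look roughly like this (a sketch under a few assumptions: request interception was enabled via Network.setRequestInterception, fontCollection is a Map from URL to family name, and each intercepted request is allowed to continue afterwards):

const fetch = require('node-fetch');
const fontkit = require('fontkit');

client.on('Network.requestIntercepted', async e => {
  if (e.resourceType === 'Font') {
    try {
      // Download the font file and read its family name.
      const response = await fetch(e.request.url);
      const font = fontkit.create(await response.buffer());
      fontCollection.set(e.request.url, font.familyName);
    } catch (err) {
      console.error('Could not parse font', e.request.url, err);
    }
  }
  // Let the original request through so the page still renders.
  await client.send('Network.continueInterceptedRequest', {
    interceptionId: e.interceptionId,
  });
});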

SlimerJS takes a snapshot of only the visible area

I am taking a screenshot with SlimerJS, and if I specify a width smaller than the minimum width of the page, the screenshot is cut to the visible region.
This happens only on Facebook.
https://github.com/Samael500/save_screenshot_test/blob/32af68387072a690ebce18b29c973330ac2497b4/img/slimerjs/www.facebook.com-240px.png
But other sites (such as Google) render the whole page.
https://github.com/Samael500/save_screenshot_test/blob/32af68387072a690ebce18b29c973330ac2497b4/img/slimerjs/google.com-240px.png
I take the screenshot with this script:
var page = require('webpage').create();
page.settings.userAgent = 'Mozilla/5.0 (X11; Linux x86_64) Gecko/20100101 Firefox/36.0';
page.open('https://www.facebook.com/', function (status) {
  page.viewportSize = { width: 240, height: 768 };
  page.render('img.png');
  page.close();
  slimer.exit();
});
And call it this way:
$ ./slimerjs slimer_screen.js --ssl-protocol=any
How can I get a snapshot of the full Facebook page?
I found a solution. Before opening the page, you must specify the native resolution for the browser.
var page = require('webpage').create();
page.viewportSize = { width: 1024, height: 768 };
page.settings.userAgent = 'Mozilla/5.0 (X11; Linux x86_64) Gecko/20100101 Firefox/36.0';
page.open('https://www.facebook.com/', function (status) {
  page.viewportSize = { width: 240, height: 768 };
  page.render('img.png');
  page.close();
  slimer.exit();
});
So the solution is the line page.viewportSize = { width: 1024, height: 768 }; placed before page.open.