Batches with BulkWriter in google firestore - google-cloud-firestore

Does anyone know why this does not work? What am I doing wrong here? It gets stuck after the console.log "after read stream".
I am trying to read a bunch of files, convert them to JSON, and upload them to Firestore with BulkWriter.
After every 400 documents I call close() to write them to Firestore, and then I create a new BulkWriter.
I also tried awaiting bulkWriter.create(eventDoc, {}), but that does not work either: it also gets stuck, and there is no error. Why is this? The create method returns a promise.
Why can't it be awaited?
https://googleapis.dev/nodejs/firestore/latest/BulkWriter.html#create
The idea is to process one file at a time; a file can contain tens of thousands of rows that need to be uploaded to Firestore.
I am calling this method in a for...of loop and awaiting processBatch.
Any help is highly appreciated.
async processBatch(document: string, file: string): Promise<void> {
const db = admin.firestore();
console.log('start: ', document);
let bulkWriter;
const writeBatchLimit = 400;
let documentsInBatch = 0;
let totalInDocument = 0;
const eventsCollectionRef = db.collection('events');
const eventDoc = eventsCollectionRef.doc(document);
return new Promise((resolve, reject) => {
console.log('promise');
bulkWriter = db.bulkWriter();
const csvStream = fs.createReadStream(file);
console.log('after read stream');
bulkWriter.create(eventDoc, {})
.then(result => {
console.log('Successfully: ', result);
csvStream.pipe(csvParser())
.on('data', row => {
console.log('row');
bulkWriter.create(eventDoc.collection('event').doc(), row);
documentsInBatch++;
if (documentsInBatch > writeBatchLimit) {
bulkWriter.close();
totalInDocument = + documentsInBatch;
documentsInBatch = 0;
bulkWriter = db.bulkWriter();
}
})
.on('end', () => {
console.log('file: ', file + ', totalInDocument: ', totalInDocument);
resolve();
});
})
.catch(err => {
console.log('Failed: ', err);
reject();
});
});
}

This seems to work:
async processBatch(document: string, file: string): Promise<void> {
const db = admin.firestore();
console.log(`start: ${document}`);
let bulkWriter;
const writeBatchLimit = 400;
let documentsInBatch = 0;
let numOfBatches = 0;
let totalInDocument = 0;
const eventsCollectionRef = db.collection('events');
const eventDoc = eventsCollectionRef.doc(document);
bulkWriter = db.bulkWriter();
const csvStream = fs.createReadStream(file);
bulkWriter.create(eventDoc, {});
csvStream.pipe(csvParser())
.on('data', row => {
bulkWriter.create(eventDoc.collection('event').doc(), row);
documentsInBatch++;
if (documentsInBatch > writeBatchLimit) {
numOfBatches++;
totalInDocument += documentsInBatch;
documentsInBatch = 0;
bulkWriter.close();
console.log(`Committing batch ${numOfBatches}, cumulative: ${totalInDocument}`);
bulkWriter = db.bulkWriter();
}
})
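.on('end', async () => {
// Rows queued since the last close() are still pending; close the current writer so they are committed too.
await bulkWriter.close();
totalInDocument += documentsInBatch;
console.log(`file: ${file}, totalInDocument: ${totalInDocument}`);
});
}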
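A common shape for this kind of import is to queue every row on a single writer and wait once at the end, instead of awaiting each create() call. The sketch below assumes that the promise returned by create() only settles after the write has actually been sent (on flush(), on close(), or when an internal batch fills up), which would explain why awaiting it up front appears to hang. That reading of the BulkWriter docs is an assumption, not something confirmed in this thread. `uploadRows` and `makeRef` are hypothetical names, and `writer` is any object with BulkWriter-like create() and close() methods.

```javascript
// Sketch: queue all rows on ONE writer, then flush once with close().
// `writer`  - assumed to expose create(ref, data) and close(), like a Firestore BulkWriter
// `rows`    - any sync or async iterable of row objects (for example a parsed CSV stream)
// `makeRef` - hypothetical factory that returns a new document reference per row
async function uploadRows(writer, rows, makeRef) {
  const results = [];
  for await (const row of rows) {
    // Deliberately NOT awaited here: the write is only queued at this point.
    // Map each outcome to true/false so a failed row does not reject everything.
    results.push(writer.create(makeRef(), row).then(() => true, () => false));
  }
  // close() sends whatever is still queued and resolves once it is done.
  await writer.close();
  const outcomes = await Promise.all(results);
  const written = outcomes.filter(Boolean).length;
  return { written, failed: outcomes.length - written };
}
```

With the real library this might be called as uploadRows(db.bulkWriter(), fs.createReadStream(file).pipe(csvParser()), () => eventDoc.collection('event').doc()), since Node streams are async iterable. Counting the outcomes per row makes partial failures visible instead of silent.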

Related

How do I create a JavaScript class with a constructor?

Also what are these:
from selenium import webdriver
from selenium.webdriver.common.by import By

# Shared browser session, so that both functions below can use it.
driver = webdriver.Chrome()

def hladaj(vyraz):
    # Open Bing and submit the search expression.
    driver.get('https://www.bing.com/')
    search_bar = driver.find_element(By.NAME, 'q')
    search_bar.send_keys(vyraz)
    search_bar.submit()

def otvor(n):
    # Click the n-th result link (1-based).
    result_links = driver.find_elements(By.CSS_SELECTOR, '.b_algo a')
    result_links[n-1].click()

class Laptop {
  constructor(vyrobca, model, rocnik) {
    this.vyrobca = vyrobca;
    this.model = model;
    this.rocnik = rocnik;
  }
  // Returns the fields as one comma-separated string.
  vypis() {
    return `${this.vyrobca},${this.model},${this.rocnik}`;
  }
}
<script>
// Get the button
var button = document.getElementById("remove");
// Add an event listener for the button click
button.addEventListener("click", function() {
  // Get all elements with the class "col"
  var elements = document.getElementsByClassName("col");
  // Walk the live collection backwards so that removals do not skip elements
  for (var i = elements.length - 1; i >= 0; i--) {
    // Read the text of the element
    var text = elements[i].textContent;
    // If the element contains the string "Row"
    if (text.includes("Row")) {
      // Remove the element
      elements[i].parentNode.removeChild(elements[i]);
    }
  }
});
</script>
function o2a(obj) {
  // Object -> array of [key, value] pairs
  return Object.keys(obj).map(key => [key, obj[key]]);
}
function a2o(arr) {
  // Array of [key, value] pairs -> object
  let obj = {};
  arr.forEach(item => {
    obj[item[0]] = item[1];
  });
  return obj;
}
import React, { useState } from "react";

// Controlled input: the text lives in the parent and is passed down.
function Inp({ value, onChange }) {
  return (
    <input
      value={value}
      onChange={(e) => onChange(e.target.value)}
      placeholder="Input value"
    />
  );
}

// The button only forwards the click to the handler it receives.
function But({ onClick }) {
  return <button onClick={onClick}>Submit</button>;
}

function Out({ value }) {
  return <div>{value}</div>;
}

function App() {
  // draft: what is currently typed; inputValue: what was last submitted.
  const [draft, setDraft] = useState("");
  const [inputValue, setInputValue] = useState("");
  const handleClick = () => {
    setInputValue(draft);
  };
  return (
    <div>
      <Inp value={draft} onChange={setDraft} />
      <But onClick={handleClick} />
      <Out value={inputValue} />
    </div>
  );
}
export default App;
JavaScript runs on a single thread by default. To decide when to run which function, it uses the event loop.
Multi-threaded behavior can be achieved with a built-in API called Web Workers. This API allows work to run in parallel.
In addition, each tab and the browser itself have their own JavaScript threads, so the browser as a whole is multi-threaded.
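To make the idea of running work off the main thread concrete, here is a minimal sketch. It uses Node's worker_threads module as a stand-in for the browser's Web Worker API (the two are similar in spirit but not the same API), so that it can be run from the command line. The function name `sumInWorker` is hypothetical.

```javascript
const { Worker } = require('worker_threads');

// Sums 1..n on a separate thread and resolves with the result.
function sumInWorker(n) {
  // Source of the worker; it runs in its own thread with its own event loop.
  const source = `
    const { parentPort, workerData } = require('worker_threads');
    let total = 0;
    for (let i = 1; i <= workerData; i++) total += i;
    parentPort.postMessage(total);
  `;
  return new Promise((resolve, reject) => {
    // eval: true makes the first argument be treated as code instead of a file path.
    const worker = new Worker(source, { eval: true, workerData: n });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}
```

While the worker is busy with the loop, the main thread's event loop stays free to handle other callbacks; the two sides only communicate through messages.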
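// Server (Node.js, uses the ws package)
const WebSocket = require('ws');
const wss = new WebSocket.Server({ port: 8084 });
let counter = 0;
setInterval(() => {
  counter += 2;
}, 2000);
wss.on('connection', (ws) => {
  console.log('Connected');
  const timer = setInterval(() => {
    ws.send(counter.toString());
  }, 2000);
  // Stop sending once this client disconnects.
  ws.on('close', () => clearInterval(timer));
});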
// Client (browser): WebSocket is built in, so no require('ws') is needed.
const ws = new WebSocket('ws://localhost:8084');
ws.onmessage = (event) => {
  const counter = parseInt(event.data, 10);
  const span = document.createElement('span');
  span.textContent = `Current counter value: ${counter}`;
  document.body.appendChild(span);
};
1. The browser reads the code as text.
2. It parses it into an AST (Abstract Syntax Tree).
3. It optimizes it.
4. Internally, it compiles the code to bytecode at run time and monitors it while it runs. During monitoring, the engine looks at which functions are called often or are computationally expensive, and optimizes them accordingly.
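const http = require('http');
const fs = require('fs');

const server = http.createServer(async (req, res) => {
  // Only GET is supported.
  if (req.method !== 'GET') {
    res.writeHead(405, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ error: 'error' }));
    return;
  }
  if (req.url === '/indexFile') {
    // Serve the static HTML page.
    fs.readFile('./index.html', (err, html) => {
      if (err) {
        res.writeHead(500);
        res.end();
        return;
      }
      res.writeHead(200, { 'Content-Type': 'text/html' });
      res.end(html);
    });
  } else if (req.url === '/response') {
    try {
      // Forward the JSON from the upstream service (global fetch needs Node 18+).
      const upstream = await fetch('http://esh.op/order');
      const data = await upstream.json();
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(data));
    } catch (err) {
      res.writeHead(502);
      res.end();
    }
  } else {
    res.writeHead(404);
    res.end();
  }
});
server.listen(8080);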
function conv(str){
str = str.replaceAll(',',"");
str = str.replaceAll(';', "");
str = str.replaceAll('.',"");
return str.split('').sort((a,b) => b.localeCompare(a))
}

How to update a collection you have already queried (MongoDB, Mongoose)

I use a collection to gather some ratings data and work with it. By the time the rating update process finishes, I have new ratings that I would like to write back to the collection. However, I can't call update, because I get the error "Cannot overwrite model once compiled." I understand that I have already compiled the model once to read the data, and that is why I get the error. Is there any way I can update the collection? Or will I have to work around it by creating a new collection with the latest ratings and then matching it against the one I use to work with the data?
This is my code:
let calculateRating = async () => {
const getData = await matchesCollection().find().lean();
const playerCollection = await playersCollection();
const getDataPlayer = await playerCollection.find().lean();
let gamesCounting = [];
getDataPlayer.forEach((player) => {
player.makePlayer = ranking.makePlayer(1500);
});
for (let i = 0; i < getData.length; i++) {
const resultA = getDataPlayer.findIndex(
({ userId }) => userId === getData[i].userA
);
const resultB = getDataPlayer.findIndex(
({ userId }) => userId === getData[i].userB
);
const winner = getData[i].winner;
if (getDataPlayer[resultA] === undefined) {
continue;
} else if (getDataPlayer[resultB] === undefined) {
continue;
}
gamesCounting.push([
getDataPlayer[resultA].makePlayer,
getDataPlayer[resultB].makePlayer,
winner,
]);
}
ranking.updateRatings(gamesCounting);
let ratingsUpdate = [];
getDataPlayer.forEach((item) => {
let newRating = item.makePlayer.getRating();
let newDeviation = item.makePlayer.getRd();
let newVolatility = item.makePlayer.getVol();
item.rating = newRating;
item.rd = newDeviation;
item.vol = newVolatility;
ratingsUpdate.push(item);
});
};
I tried the workaround of creating a new collection.
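For what it's worth, "Cannot overwrite model once compiled" usually means that mongoose.model(name, schema) is executed a second time for the same name. A commonly used guard is to reuse the compiled model, for example `mongoose.models.Player || mongoose.model('Player', playerSchema)` (the names `Player` and `playerSchema` are hypothetical here). Once the model is only compiled once, the new ratings could be written back to the same collection in a single call. The sketch below only builds the list of operations; handing it to `Model.bulkWrite(ops)` is the assumed usage, and `buildRatingUpdates` is a hypothetical helper name.

```javascript
// Turns the recalculated players (as collected in ratingsUpdate above)
// into one updateOne operation per player, keyed by userId.
function buildRatingUpdates(players) {
  return players.map(({ userId, rating, rd, vol }) => ({
    updateOne: {
      filter: { userId },
      // Only the three rating fields are written; helper fields such as
      // makePlayer are left out of the update on purpose.
      update: { $set: { rating, rd, vol } },
    },
  }));
}
```

Under these assumptions, the end of calculateRating could then call something like `await playerCollection.bulkWrite(buildRatingUpdates(ratingsUpdate))`, provided playersCollection() returns a Mongoose model.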

ag-grid continues to show the loading icon when no data returned from the server

I'm facing a strange behavior in AG Grid (Angular). When I use the serverSide option and no data is returned from the server, the grid shows the loading icon for all the rows specified by cacheBlockSize. I've tried as many options as I could to hide these empty loading rows, but nothing has worked.
I tried to replicate this on the official example page and, luckily, could reproduce similar behavior. Refer to this edited version of an official example, where I'm passing an empty array from the fake server call:
https://plnkr.co/edit/Egw9ToJmNE7Hl6Z6
onGridReady(params) {
this.gridApi = params.api;
this.gridColumnApi = params.columnApi;
this.http
.get('https://www.ag-grid.com/example-assets/olympic-winners.json')
.subscribe((data) => {
let idSequence = 0;
data.forEach((item) => {
item.id = idSequence++;
});
const server = new FakeServer(data);
const datasource = new ServerSideDatasource(server);
params.api.setServerSideDatasource(datasource);
});
}
}
function ServerSideDatasource(server) {
return {
getRows: (params) => {
setTimeout(() => {
const response = server.getResponse(params.request);
if (response.success) {
params.successCallback(response.rows, response.lastRow);
} else {
params.failCallback();
}
}, 2000);
},
};
}
function FakeServer(allData) {
return {
getResponse: (request) => {
console.log(
'asking for rows: ' + request.startRow + ' to ' + request.endRow
);
const rowsThisPage = allData.slice(request.startRow, request.endRow);
const lastRow = allData.length <= request.endRow ? data.length : -1;
return {
success: true,
rows: [],
lastRow: lastRow,
};
},
};
}
The screenshot of the plunker output is given below.
Just figured out that it's a problem with the lastRow value. If the rows are empty but lastRow is still -1 (which tells the grid that the last row is unknown), it keeps trying to load data and shows the loading icon for all the rows as per cacheBlockSize.
Fixed code below:
function FakeServer(allData) {
return {
getResponse: (request) => {
console.log(
'asking for rows: ' + request.startRow + ' to ' + request.endRow
);
let data = []; //allData;
const rowsThisPage = data.slice(request.startRow, request.endRow);
const lastRow = data.length <= request.endRow ? data.length : -1;
return {
success: true,
rows: rowsThisPage,
lastRow: lastRow,
};
},
};
}
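The rule can be isolated in a small pure function, which makes it easy to check the empty-result case without the grid. This is only a sketch using the same lastRow expression as the fixed code; `buildResponse` is a hypothetical name.

```javascript
// Returns one page of rows plus lastRow:
//   -1            -> more rows may exist, the grid should keep loading blocks
//   a real count  -> the total is known, the grid stops showing loading rows
function buildResponse(allData, request) {
  const rows = allData.slice(request.startRow, request.endRow);
  const lastRow = allData.length <= request.endRow ? allData.length : -1;
  return { success: true, rows, lastRow };
}
```

For an empty data set this yields lastRow 0 instead of -1, which is the difference between the original plunker and the fixed version.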
Update for AG Grid 28:
function FakeServer(allData) {
return {
getResponse: (params) => {
const request = params.request;
console.log(
'asking for rows: ' + request.startRow + ' to ' + request.endRow
);
let data = []; //allData;
const rowsThisPage = data.slice(request.startRow, request.endRow);
const lastRow = data.length <= request.endRow ? data.length : -1;
params.success({ rowData: rowsThisPage, rowCount: lastRow });
},
};
}

How to simplify the search query parameter?

The problem
I have a movie database with the indexName: 'movies'.
Let's say my query is John then the domain is domain.tld/?movies[query]=John.
I want to simplify the search query parameter to domain.tld/?keywords=John. How can I do that?
What I already know
After reading through the docs I know that I have to modify the createURL and the parseURL somehow:
createURL({ qsModule, location, routeState }) {
const { origin, pathname, hash } = location;
const indexState = routeState['movies'] || {};
const queryString = qsModule.stringify(routeState);
if (!indexState.query) {
return `${origin}${pathname}${hash}`;
}
return `${origin}${pathname}?${queryString}${hash}`;
},
...
parseURL({ qsModule, location }) {
return qsModule.parse(location.search.slice(1));
},
After some trial and error, here is a solution:
createURL({ qsModule, location, routeState }) {
const { origin, pathname, hash } = location;
const indexState = routeState['movies'] || {}; // routeState[indexName]
//const queryString = qsModule.stringify(routeState); // default -> movies[query]
const queryString = 'keywords=' + encodeURIComponent(indexState.query); // NEW
if (!indexState.query) {
return `${origin}${pathname}${hash}`;
}
return `${origin}${pathname}?${queryString}${hash}`;
},
...
parseURL({ qsModule, location }) {
//return qsModule.parse(location.search.slice(1)); // default: e.g. movies%5Bquery%5D=john
const { keywords = '' } = qsModule.parse(location.search.slice(1)); // NEW: read ?keywords=...
return { movies: { query: keywords } }; // NEW: map it back to routeState[indexName].query
},
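The mapping in both directions can be tried outside the search UI. The sketch below is a standalone version that swaps the injected qsModule for the built-in URLSearchParams, so it is not the library's own routing code; `INDEX_NAME` and the plain `location` objects are assumptions made for illustration.

```javascript
const INDEX_NAME = 'movies';

// routeState -> URL: only the query is kept, under the short "keywords" key.
function createURL({ location, routeState }) {
  const { origin, pathname, hash } = location;
  const query = (routeState[INDEX_NAME] || {}).query;
  if (!query) {
    return `${origin}${pathname}${hash}`;
  }
  const params = new URLSearchParams({ keywords: query });
  return `${origin}${pathname}?${params.toString()}${hash}`;
}

// URL -> routeState: a missing "keywords" parameter becomes an empty query.
function parseURL({ location }) {
  const keywords = new URLSearchParams(location.search).get('keywords') || '';
  return { [INDEX_NAME]: { query: keywords } };
}
```

Parsing by parameter name rather than by position also keeps the route working when other query parameters are present in the URL.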

GitHub API: get the last users that committed

I want to get the most recent users, e.g. the last 100 users that committed on GitHub, regardless of repo. I've looked around the GitHub API but can't find the specific API call.
You can use the GitHub Events API and filter for PushEvent:
https://api.github.com/events?per_page=100
User(s) who have made the last 100 commits on GitHub
As a PushEvent may contain multiple commits, you have to sum the size of each PushEvent until you reach 100. Note that you also need to exclude PushEvents with 0 commits. You also have to handle pagination, as you can request at most 100 events at once (in case one request is not enough to get 100 commits).
An example using Node.js:
var request = require("request");
const maxCommit = 100;
const accessToken = 'YOUR_ACCESS_TOKEN';
function doRequest(page){
return new Promise(function (resolve, reject) {
request({
url: 'https://api.github.com/events?per_page=100&page=' + page,
headers: {
'User-Agent': 'Some-App',
'Authorization': 'Token ' + accessToken
}
}, function (err, response, body) {
if (!err) {
resolve(body);
} else {
reject(err);
}
});
})
}
async function getEvents() {
var commitCount = 0;
var page = 1;
var users = [];
while (commitCount < maxCommit) {
var body = await doRequest(page);
var data = JSON.parse(body);
var pushEvents = data.filter(it => it.type == 'PushEvent' && it.payload.size > 0);
commitCount += pushEvents.reduce((it1, it2) => it1 + it2.payload.size, 0);
users = users.concat(pushEvents.map(event => ({
login: event.actor.login,
commitCount: event.payload.size
})));
page++;
}
var count = 0;
for (var i = 0; i < users.length; i++) {
count += users[i].commitCount;
if (count >= maxCommit){
users = users.slice(0, i + 1);
break;
}
}
console.log(users);
}
getEvents();
Last 100 users who have pushed commits on GitHub
The only thing that changes is that we only check that the size field is > 0 and build a map of distinct users.
An example using Node.js:
var request = require("request");
const maxUser = 100;
const accessToken = 'YOUR_ACCESS_TOKEN';
function doRequest(page){
return new Promise(function (resolve, reject) {
request({
url: 'https://api.github.com/events?per_page=100&page=' + page,
headers: {
'User-Agent': 'Some-App',
'Authorization': 'Token ' + accessToken
}
}, function (err, response, body) {
if (!err) {
resolve(body);
} else {
reject(err);
}
});
})
}
async function getEvents() {
var page = 1;
var users = {};
while (Object.keys(users).length < maxUser) {
var body = await doRequest(page);
var data = JSON.parse(body);
var pushEvents = data.filter(it => it.type == 'PushEvent' && it.payload.size > 0);
for (var i = 0; i < pushEvents.length; i++) {
users[pushEvents[i].actor.login] = pushEvents[i].payload.size;
if (Object.keys(users).length == maxUser) { 
break;
}
}
page++;
}
console.log(users);
}
getEvents();
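The distinct-user logic can also be separated from the network call, which makes it possible to try it on sample data without an access token. This is a sketch, not part of the answer above: `collectPushers` is a hypothetical helper, and unlike the loop above it adds up the commits of a user who appears more than once instead of overwriting the earlier value.

```javascript
// Collects up to maxUsers distinct logins from a list of event objects
// shaped like the Events API responses used above.
function collectPushers(events, maxUsers) {
  const users = new Map(); // login -> number of commits seen so far
  for (const event of events) {
    // Skip everything that is not a push with at least one commit.
    if (event.type !== 'PushEvent' || !(event.payload.size > 0)) continue;
    const login = event.actor.login;
    users.set(login, (users.get(login) || 0) + event.payload.size);
    if (users.size === maxUsers) break;
  }
  return users;
}
```

A Map keeps the logins in the order they were first seen, so the result stays sorted from most recent to oldest when the events are passed in the order the API returns them.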