Full text search on content of documents - mongodb

I need to store several files in formats such as PDF, DOCX, and XLSX.
I also need to run full-text search over the contents of those files. For example, take a file named test.pdf that contains three lines:
first test line
two my test line
three test line
Given an input like 'my', I need to get back the file test.pdf.
Can I do this with MongoDB or with other tools (AWS, Alfresco)?
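To make the requirement concrete, here is a minimal sketch of the lookup described above, independent of any particular database. It assumes the text has already been extracted from each file (the extraction step is not shown), and the names build_index, search, and other.docx are hypothetical. It only illustrates the word-to-file mapping that a full-text index provides, not how MongoDB or any other tool implements it.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercased word to the set of file names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the files that contain every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical, already-extracted text; getting it out of PDF/DOCX/XLSX
# files is a separate step that this sketch does not cover.
documents = {
    "test.pdf": "first test line\ntwo my test line\nthree test line",
    "other.docx": "nothing relevant here",
}
index = build_index(documents)
```

Calling search(index, "my") looks up the word in the index and returns the names of the files whose extracted text contains it.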

Related

Neovim and Telescope: find files with two strings

I am using Telescope with Neovim to find files quickly. However, when I enter two search strings separated by a space, Telescope does not list any files. How can I search for files whose path contains two individual substrings?
As Telescope is a fuzzy finder, you don't need to separate the search terms with spaces; just type them together.
For example, a file named some_python_script.py will appear as a result of the query somescript.
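The idea behind that answer can be sketched as an in-order character match. This is a simplified illustration only: the function name is_fuzzy_match is hypothetical, and Telescope's actual sorters also score and rank candidates, which is not modelled here.

```python
def is_fuzzy_match(query, candidate):
    """True if the characters of query appear in candidate in the same order."""
    remaining = iter(candidate.lower())
    # "ch in remaining" advances the iterator until ch is found,
    # so each query character must occur after the previous one.
    return all(ch in remaining for ch in query.lower())
```

Under this simplified model, the query somescript matches some_python_script.py, while a query containing a space does not, because the path itself has no space character.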

How to extract texts from images in a folder

My laptop runs Ubuntu. I have a folder called testdata that contains many jpg images, and I want to know how to run Tesseract on all of them and save the output to another folder, e.g. "testresult". The output can be a single txt file containing the text extracted from all images, or one txt file per image containing only that image's text.
For a single image, the command line I know is tesseract test_01.jpg test_01
Could anyone help me please?
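One possible approach, sketched here under assumptions, is to repeat the single-image command from the question for every jpg in the folder. The function names build_commands and run_all are hypothetical; the sketch assumes the tesseract command-line tool is installed and on the PATH, and it produces one txt file per image.

```python
import subprocess
from pathlib import Path


def build_commands(input_dir, output_dir):
    """Return one tesseract command per .jpg file in input_dir."""
    commands = []
    for image in sorted(Path(input_dir).glob("*.jpg")):
        # tesseract appends ".txt" to the output base name itself.
        output_base = Path(output_dir) / image.stem
        commands.append(["tesseract", str(image), str(output_base)])
    return commands


def run_all(input_dir="testdata", output_dir="testresult"):
    """Run tesseract on every image; needs the tesseract CLI on PATH."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for command in build_commands(input_dir, output_dir):
        subprocess.run(command, check=True)
```

Calling run_all() from the directory that contains testdata would write testresult/test_01.txt for testdata/test_01.jpg, and so on. Concatenating those files afterwards would give the single-file variant.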

Extracting data from complex output text file using Perl and placing into new text file

The complete output text file is hundreds of lines long, with relevant nuclear cross sections and a plethora of other data that I do not need for this particular problem. I am trying to extract the columns of data under "BURNUP" and the first "K-INF" from the file I attached and place them into a separate file. I am a newbie and have a similar Perl script from a professor. I have tried to adapt it to the information I am looking for, but the only output I get is the 2 print statements. Any suggestions?

Search multiple strings in eclipse with AND operation

I need to search the whole workspace for all files containing both "file1.txt" AND "file2.txt" as strings. How can I achieve this in Eclipse search?
Possibly a duplicate of "Search multiple strings in eclipse", but that did not work with file names and an AND operation.
As far as I know, you can't, but as a workaround:
You could search for the file name pattern file?.txt and then, in the Search view, click the down arrow icon on the right, choose to display as a list, and double-click only the file1.txt and file2.txt matches.
You will do two individual searches, one for file1.txt and one for file2.txt, and then intersect the two results.
Step 1. Export the search results
As pointed out here, you can install the Eclipse Search CSV Export plugin. In the Marketplace, search for 'csv export'; it is the first result. Once it is installed, perform a File Search as usual; then, in the toolbar of the Search tab, click the icon named 'Export Search Results as CSV' to export the results as a CSV file.
Obviously, you will perform a second search, and export those results, too.
Step 2. Convert the .csv file to .xlsx format
As shown here, in the section 'How to import CSV to Excel', open Excel, go to the Data tab > 'From Text', and browse to the .csv file; in the wizard, choose 'Delimited' (instead of 'Fixed width'), choose 'Comma' as the delimiter, preview the columns, and click OK.
Now that you have converted the data into a more readable format, you can also Save as > .xlsx
Two columns are important here, Path and Location. I prefer Location, as it is an absolute location, not project relative.
Sort this column, select all its content, paste the content to a .txt file, save the .txt file. Do the same for your second search.
Step 3. Create the intersection of the lists of the Locations
In the previous step, you copied the Location column into a separate file for each individual search. Navigate to Venn webtools, browse to the .txt files, and click Submit. The output will show the intersection and the differences between the two files. The intersection represents the locations of the files that contain BOTH your search strings, a functionality that should have been offered by the IDE. Please note that this tool allows up to 3 files to be uploaded.
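As an alternative to the web tool, the intersection in Step 3 can also be computed locally. The sketch below assumes each export has been reduced to one location per line, as in Step 2; the function names and the sample paths are hypothetical.

```python
def read_locations(text):
    """Return the set of non-empty, stripped lines from a Location column."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def files_matching_both(first_export, second_export):
    """Sorted locations that appear in both search results."""
    return sorted(read_locations(first_export) & read_locations(second_export))


# Hypothetical contents of the two .txt files saved in Step 2.
first = "/ws/a/Foo.java\n/ws/b/Bar.java\n"
second = "/ws/b/Bar.java\n/ws/c/Baz.java\n"
both = files_matching_both(first, second)
```

In practice, first and second would be read from the two saved .txt files, for example with pathlib.Path(...).read_text().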

Display contents of text file in MATLAB shell

I'm using MATLAB under Windows, and trying to display (dump) the contents of a text file in the command shell. It seems like overkill to open a small file in the editor, or to load the file to use disp.
Use type and specify the explicit file name (including the extension), for instance:
type('myfile.txt')
As well as type, there's also dbtype which lets you pick a start and end range to print, and shows line numbers - handy for listing source files.