Import match from Excel unreliable - FileMaker

I have a script set up in FileMaker 11 which should import data from an Excel sheet. There's a field containing a unique number in both the FileMaker database and the .xlsx file, which is used to match already existing entries. "Update matching records in found set" and "Add remaining data as new records" are both enabled.
Unfortunately, FileMaker seems to behave completely arbitrarily here. Using the same script and the same .xlsx file several times in a row, the results are completely unpredictable. Sometimes already existing records are correctly skipped or updated; sometimes they are added a second (or third or fifth …) time.
Is this a bug, maybe specific to version 11, which was sorted out later? Am I missing something about importing?

Here's my official answer to your question:
Imports into FileMaker databases are found set sensitive.
If you're importing records in a way that acts on existing records (update matching), FileMaker uses the found set showing in the layout of your active window to do the matching, rather than the entire table itself.
Regarding its localness (per your comment above), it allows you to do more precise imports. In a case where you want to make sure you only match specific records (e.g. you have a spreadsheet of data from employees at company A and you only want to update employee records for company A), you could perform a find and limit the found set to just your desired records before importing with matching turned on. This way the import will ONLY look at the records in your found set to do its evaluation. This means less processing, because FM has to look at fewer records, and also less risk that you're going to find a match that you didn't want (depending on what your match criteria are).
I'm having a hard time finding a good and up-to-date reference for you. All I can find is this one, which is from the FM10 days. I would suggest bookmarking the FileMaker 13 help pages. It's the same set of help documents available from the Help menu in FileMaker Pro, but I find it much easier to search the docs via a browser.

Is there any way to save lots of data on SharePoint by just using the REST API (or any client-side solution)?

I have been asked to develop a web application to be hosted on SharePoint 2013 that has to work with lots of data. It basically consists of a huge form used to save and edit a lot of information.
Unfortunately, due to some work restrictions, I do not have access to the backend, so it has to be done entirely client-side.
I am already aware of how to programmatically create SharePoint lists with site columns and save data to them with REST.
The problem is, I need to create a SharePoint list (to be used as a database) with at least 379 site columns (fields), of which 271 have to be single lines of text and 108 multiple lines of text, and by doing so I think I would exceed the threshold limit (too many site columns on a single list).
Is there any way I could make this work? Any other solution for how to save big amounts of data on SharePoint using only client-side solutions (e.g. REST)? Maybe there is a way to save an XML or JSON file on SharePoint through REST?
I don't remember if there is actually some limit regarding columns in SP 2013. There is certainly a limit when you use lookup columns (up to 12 lookup columns in one view), but since you are using only text columns this should not be a blocker. There is also a limit on the number of rows that may be present in one view (5,000 for a normal user, 20,000 for an admin user).
Best to check all the boundaries and limits here: https://learn.microsoft.com/en-us/sharepoint/install/software-boundaries-and-limits
As described by MS:
The sum of all columns in a SharePoint list cannot exceed 8,000 bytes.
Also, you may create one Note column and store all the data as a JSON structure saved in that column, or create a library instead of a list and store JSON documents in it (just be aware that by default the JSON format is blocked from being stored in SharePoint 2013, and you need to change those settings in Central Admin for the web application -> link). Also be aware that with this approach you will not be able to use many OOB SharePoint features, like column filters, which might be handy.
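If you go the Note-column route, the save is just a normal item-creation call with the whole form serialized into one multi-line text field. A minimal sketch, shown in Python for brevity (the same REST calls work from client-side JavaScript); the site URL, list name "FormData", column "Payload" and the credentials are all made-up placeholders:

    import json
    import requests

    # Sketch only: swap in whatever auth your farm actually uses
    # (on-prem SharePoint 2013 is typically NTLM, e.g. via requests_ntlm).
    site_url = "https://sharepoint.example.com/sites/mysite"
    auth = ("DOMAIN\\user", "password")

    verbose = {"Accept": "application/json;odata=verbose",
               "Content-Type": "application/json;odata=verbose"}

    # 1. Get a form digest, which SharePoint requires for write operations.
    digest = requests.post(site_url + "/_api/contextinfo", headers=verbose, auth=auth) \
        .json()["d"]["GetContextWebInformation"]["FormDigestValue"]

    # 2. Serialize the whole form into one Note (multi-line text) column and create the item.
    form_values = {"firstName": "Ada", "question_017": "some answer"}  # the big form, flattened
    item = {
        "__metadata": {"type": "SP.Data.FormDataListItem"},  # entity type name of the target list
        "Title": "Submission 001",
        "Payload": json.dumps(form_values),
    }

    resp = requests.post(
        site_url + "/_api/web/lists/getbytitle('FormData')/items",
        data=json.dumps(item),
        headers=dict(verbose, **{"X-RequestDigest": digest}),
        auth=auth,
    )
    resp.raise_for_status()
    print(resp.json()["d"]["Id"])  # id of the newly created item

The trade-off is exactly the one mentioned above: any filtering or searching inside Payload has to happen in your own code after you read the items back, since SharePoint only sees one opaque text field.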

How to remove words from a document on a column-by-column basis instead of whole lines in Word

Perhaps a stupid question, but I have a document with a large number of numerical values arranged in columns, although not in Word's actual column formatting, and I want to delete certain columns while leaving one intact. Here's a link to a part of my document.
Data
As can be seen, there are four columns and I only want to keep the 3rd column, but when I select any of this in Word, it selects the whole line. Is there a way I can select data in Word as a column, rather than as whole lines? If not, can this be done in other word-processing programs?
Generally, spreadsheet apps or subprograms are what you need for deleting and modifying data in column or row format.
Microsoft's spreadsheet equivalent is Excel, part of the Microsoft Office Suite that Word came with. I believe Google Docs has a free spreadsheet tool online as well.
I have not looked at the uploaded file, but if it is small enough, you might be able to paste one row of data at a time into a spreadsheet, and then do your operation on the column data all at once.
There may be other solutions to this problem, but that's a start.
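If you're comfortable running a short script instead of a spreadsheet, something along these lines would keep only the third column from a plain-text copy of the data. A sketch in Python, assuming the values are separated by spaces or tabs; the file names are placeholders:

    # Keep only the 3rd whitespace-separated column of a plain-text file.
    # "input.txt" and "column_3.txt" are placeholder file names.
    with open("input.txt", encoding="utf-8") as src, \
            open("column_3.txt", "w", encoding="utf-8") as dst:
        for line in src:
            fields = line.split()            # split on any run of spaces or tabs
            if len(fields) >= 3:
                dst.write(fields[2] + "\n")  # fields[2] is the 3rd column (0-based index)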

Calculating and reporting Data Completeness

I have been working with measuring data completeness and creating actionable reports for our HRIS system for some time.
Until now I have used Excel, but now that the requirements for reporting have stabilized and the need for quicker response times has increased, I want to move the work to another level. At the same time I would also like more detailed options for distinguishing between different units.
As an example I am looking at missing fields. So for each employee in every company I simply want to count how many fields are missing.
For other fields I am looking to validate data - like birthdays compared to hiring dates, thresholds for different values, employee groups compared to responsibility level, and so on.
My question is where to go from here. Is there any language that is better than the others for importing lists, doing evaluations on fields in the lists and then quantifying the results at company and other levels? I want to be able to extract data from our different systems, then have a program do all the calculations and summarize the findings in some way. (I consider it to be a good learning experience.)
I've done something like this in the past and sort of cheated. I wrote a program that ran nightly, identified missing fields (not required, but necessary for data integrity) and dumped those to an incomplete-record table that was cleared each night before the process ran. I then sent batch emails to the group responsible for each missing element (Payroll/Benefits/Compensation/HR Admin) so the missing data could be added. I used .NET against an Oracle database and sent emails via Lotus Notes, but a similar design should work in just about any environment.
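If Python ends up being the language you pick, the core of that kind of nightly check is only a few lines with pandas. A minimal sketch under assumptions of mine (the CSV export, the column names and the field-to-group mapping are all invented):

    import pandas as pd

    # Count missing fields per employee and per company from an HRIS export.
    df = pd.read_csv("hris_export.csv")

    checked_fields = ["birth_date", "hire_date", "job_title", "manager_id"]
    responsible = {"birth_date": "HR Admin", "hire_date": "HR Admin",
                   "job_title": "Compensation", "manager_id": "HR Admin"}

    # Missing-field count per employee, then summarized per company.
    df["missing_count"] = df[checked_fields].isna().sum(axis=1)
    print(df.groupby("company")["missing_count"].agg(["sum", "mean"]))

    # Simple validation example: birth date must precede hire date.
    bad_dates = df[pd.to_datetime(df["birth_date"]) >= pd.to_datetime(df["hire_date"])]
    print(len(bad_dates), "records with birth date on or after hire date")

    # One row per missing value, tagged with the group that should fix it
    # (the equivalent of the incomplete-record table described above).
    incomplete = (df.melt(id_vars=["employee_id", "company"],
                          value_vars=checked_fields,
                          var_name="field", value_name="value")
                    .loc[lambda d: d["value"].isna()]
                    .assign(owner=lambda d: d["field"].map(responsible)))
    print(incomplete[["employee_id", "company", "field", "owner"]])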

Data Transformation Help - Variety of Documents - Distinct Fields

Let's say I want to transfer data from one MongoDB cluster with 50 million records to another one where the self-imposed 'schema' has changed drastically, and I want to test the import + conversion before actually running it.
I am able to find a list of distinct fields just fine, but I want to pull a variety of documents so that each distinct field is pulled. This data would then be the source to test my Map-Reduce script.
The issue arose from many years of use and changes in the way the data is stored. What was originally user.orgId became user.organizationid.
Any suggestions? Even on 3rd party tools?
Basically it seems like you have two related questions:
1. How can I run an import and conversion without affecting the final collection?
2. How can I verify that the documents in a collection match a particular schema definition?
Both questions have a variety of appropriate answers.
For question 1.
a. You can create a temporary duplicate of your cluster, then run your import and conversion in that environment. This is the safest way.
b. You can simply run the import and conversion with a different final collection. This isn't as safe as (a), because it requires the developer to be diligent about selecting the appropriate collections at test time and at final deployment time.
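For option (b), with pymongo that can be as simple as pointing the conversion at a scratch collection, for example with an aggregation pipeline ending in $out. A sketch only; the connection string, collection names and the orgId rename are placeholders based on the example in the question:

    from pymongo import MongoClient

    # Run the conversion but write into a scratch collection,
    # leaving the real target collection untouched.
    client = MongoClient("mongodb://localhost:27017")
    db = client["mydb"]

    pipeline = [
        # Example conversion: expose the legacy user.orgId under the new name.
        # ($project keeps only the fields you list, so list everything you need.)
        {"$project": {
            "user.organizationid": {"$ifNull": ["$user.organizationid", "$user.orgId"]},
            "created_at": 1,
        }},
        # $out writes the pipeline result into a separate test collection.
        {"$out": "users_converted_test"},
    ]
    db["users"].aggregate(pipeline, allowDiskUse=True)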
Question 2.
This depends very much on the environment you are developing for, which I don't know anything about. But, for the sake of an example, if you were working in Python, you could use something like https://pypi.python.org/pypi/jsonschema and iterate over each document, confirming that it conforms to the schema you require. If you already have an ODM in place, with mappings that describe the schema, it should be possible to validate documents using the mapping.
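For instance, a minimal loop with jsonschema and pymongo might look like this (a sketch; the connection string, collection name and the schema itself are only illustrative):

    from jsonschema import Draft4Validator
    from pymongo import MongoClient

    # Check every document in a (test) collection against a JSON Schema.
    schema = {
        "type": "object",
        "required": ["user"],
        "properties": {
            "user": {
                "type": "object",
                "required": ["organizationid"],
                "properties": {"organizationid": {"type": "string"}},
            },
        },
    }

    validator = Draft4Validator(schema)
    coll = MongoClient("mongodb://localhost:27017")["mydb"]["users_converted_test"]

    failures = 0
    for doc in coll.find():
        doc.pop("_id", None)  # ObjectId isn't plain JSON, so leave it out of validation
        errors = list(validator.iter_errors(doc))
        if errors:
            failures += 1
            print(doc.get("user"), [e.message for e in errors])
    print("documents failing the schema:", failures)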

Database design: Postgres or EAV to hold semi-structured data

I was given the task to decide whether our stack of technologies is adequate to complete the project we have at hand or should we change it (and to which technologies exactly).
The problem is that I'm just a SQL Server DBA and I have a few days to come up with a solution...
This is what our client wants:
They want a web application to centralize pharmaceutical research studies, separated into topics, or "projects" in their jargon. These studies are sent as csv files and are somewhat structured as follows:
Project (just a name for the project)
Segment (could be behavioral, toxicology, etc. There is a finite set of about 10 segments. Each csv file holds a segment)
Mandatory fixed fields (a small set of fields that are always present, like Date, subject IDs, etc. These will be the PKs).
Dynamic fields (could be anything here, but always as key/value pairs; there shouldn't be more than 200 of them).
Whatever files (images, PDFs, etc.) are associated with the project.
At the moment, they just want to store these files and retrieve them through a simple search mechanism.
They don't want to crunch the numbers at this point.
98% of the files have a couple of thousand lines, but 2% have a couple of million rows (and around 200 fields).
This is what we are developing so far:
The back-end is SQL 2008 R2. I've designed EAVs for each segment (before anything, please keep in mind that this is not our first EAV design; it worked well before with less data), and the mid-tier/front-end is PHP 5.3 and the Laravel 4 framework with Bootstrap.
The issue we are experiencing is that PHP chokes on the big files. It can't insert into SQL in a timely fashion when there are more than 100k rows, because there's a lot of pivoting involved and, on top of that, PHP needs to get back all the field IDs first before it can start inserting. I'll explain: this is necessary because the client wants some sort of control over the field names. We created a repository of all the possible fields to try and minimize ambiguity problems; fields named, for instance, "Blood Pressure", "BP", "BloodPressure" or "Blood-Pressure" should all be stored under the same name in the database. So, to minimize the issue, the user has to actually insert his csv fields into another table first, which we called the properties table. This won't completely solve the problem, but as he's inserting the fields, he sees possible matches already inserted. When the user types in "blood", there's a panel showing all the fields already used that contain the word blood. If the user thinks it's the same thing, he has to change the csv header to the existing field name. Anyway, all this is to explain that it's not a simple EAV structure and there's a lot of back and forth of IDs.
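To make the "back and forth of IDs" concrete, this is roughly what our insert path does today, sketched here in Python just for readability (the real code is PHP; the file name, field names and the two helper functions are simplified, hypothetical stand-ins):

    import csv

    def load_field_ids():
        # hypothetical stand-in for the round trip that fetches IDs from the properties table
        return {"Blood Pressure": 101, "Heart Rate": 102}

    def bulk_insert(rows):
        # hypothetical stand-in for the actual INSERT into the EAV value table
        print(len(rows), "EAV rows to insert")

    field_ids = load_field_ids()
    rows_to_insert = []

    with open("segment_toxicology.csv", newline="") as f:   # placeholder file name
        for row in csv.DictReader(f):
            entity_key = (row["Date"], row["SubjectID"])     # the fixed PK fields
            for field_name, value in row.items():
                if field_name in ("Date", "SubjectID"):
                    continue
                # pivot: one wide csv row becomes ~200 narrow EAV rows, each of which
                # needs a field ID already registered in the properties table
                rows_to_insert.append((entity_key, field_ids[field_name], value))

    bulk_insert(rows_to_insert)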
This issue is giving us second thoughts about our technology stack, but we have limitations on our possible choices: I have only worked with relational DBs so far (only SQL Server, actually) and the other guys know only PHP. I guess a full MS stack is out of the question.
It seems to me that a non-SQL approach would be best. I read a lot about MongoDB, but honestly I think it would be a super steep learning curve for us, and if they want to start crunching the numbers or even to have some reporting capabilities, I guess Mongo wouldn't be up to that. I'm reading about PostgreSQL, which is relational, and its famous hstore type. So here is where my questions start:
Do you guys think that Postgres would be a better fit than SQL Server for this project?
Would we be able to convert the csv files into JSON objects or whatever, to be stored in hstore fields and still be somewhat queryable? (Roughly the kind of thing sketched after these questions.)
Are there any issues with Postgres sitting on a Windows box? I don't think our client has Linux admins. Nor do we, for that matter...
Is its licensing free for commercial applications?
Or should we stick with what we have and try to sort the problem out with staging tables, bulk inserts or some other technique that relies on the back-end to do the heavy lifting?
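To make the hstore question concrete, this is roughly the kind of load I picture, sketched in Python with psycopg2 (in practice it would be PHP doing this; the connection details, file, table and column names are invented, and it assumes the hstore extension is installed and "extra" is an hstore column):

    import csv
    import psycopg2
    import psycopg2.extras

    # Fixed fields become real columns; everything else goes into an hstore column.
    conn = psycopg2.connect("dbname=research user=app password=secret host=localhost")
    psycopg2.extras.register_hstore(conn)  # lets us pass Python dicts as hstore values

    FIXED = ["project", "segment", "measure_date", "subject_id"]  # the mandatory fields

    with conn, conn.cursor() as cur, open("toxicology_batch.csv", newline="") as f:
        for row in csv.DictReader(f):
            dynamic = {k: v for k, v in row.items() if k not in FIXED}  # the ~200 key/value fields
            cur.execute(
                """INSERT INTO research_rows
                       (project, segment, measure_date, subject_id, extra)
                   VALUES (%s, %s, %s, %s, %s)""",
                (row["project"], row["segment"], row["measure_date"],
                 row["subject_id"], dynamic),
            )

    # Rows stay queryable on the dynamic part, e.g.:
    #   SELECT * FROM research_rows WHERE extra -> 'Blood Pressure' IS NOT NULL;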
Sorry for the long post and thanks for your input guys, I appreciate all answers as I'm pulling my hair out here :)