Split SharePoint Content Database - PowerShell

I have a single SharePoint content database (SharePoint 2019, on-premises) that is over 100 GB, and I would like to split its sites across some new content databases that I will make.
I have created the new content databases, but I have no idea how to move the subsites to them.
Based on research that I have done, it seems I need to:
Create Content Databases
Create site collections in those databases
Move the subsites into new site collections in the new databases.
Question 1 - are the above steps correct or do I have this wrong?
Question 2 - How in the heck do I move subsites out of the almost-full content database into a new content database? Do I move them to the site collection in the new database? If so, how?
Thank you for your brainpower and help.
Tried moving subsites and failed

I could not tell whether you want to transfer just some subsites or a complete site collection, so I will cover both approaches below.
I strongly suggest that you try the scripts below in a sandbox environment first, in case anything has been misunderstood.
Before any transfer, create the content databases that you will be targeting. You can do this either through Central Administration (GUI) or with PowerShell, using the commands below:
# Get the web application under which the content database will be created.
# Without an identity, Get-SPWebApplication returns every web application in the farm, so pass the URL.
$WebApp = Get-SPWebApplication http://web-app/
# Create the new content database.
New-SPContentDatabase "<Name_of_new_Content_db>" -DatabaseServer "<db_server>" -WebApplication $WebApp
# Alternatively, use the variant below, which points directly to the web application.
#New-SPContentDatabase "<Name_of_new_Content_db>" -DatabaseServer "<db_server>" -WebApplication http://web-app/
In case you wish to transfer whole site collections, or clone them to different content databases, there are three ways to achieve this.
Copy Site Collection: use the Copy-SPSite cmdlet to make a copy of a site collection from its current (implied) source content database to a specified destination content database.
The copy of the site collection has a new URL and a new SiteID.
Copy-SPSite http://web-app/sites/original -DestinationDatabase <Name_of_new_Content_db> -TargetUrl http://web-app/sites/copyfromoriginal
Move Site Collection: the Move-SPSite cmdlet moves the data in the specified site collection from its current content database to the content database specified by the DestinationDatabase parameter.
A no-access lock is applied to the site collection to prevent users from altering data within it while the move is taking place.
Once the move is complete, the site collection is returned to its original lock state. The original URL is preserved, in contrast with Copy-SPSite, where a new one is generated.
In my test, before executing the script below, each content database was hosting at least one site collection.
Move-SPSite http://web-app/sites/originalbeforemove -DestinationDatabase <Name_of_new_Content_db>
After the execution, one site had been transferred from the last content database to the second, preserving its original URL.
Backup and Restore Site Collection: this combination saves the site collection to disk and afterwards restores it into a new content database. The Restore-SPSite cmdlet restores the site collection to the location specified by the Identity parameter. A content database may only contain one copy of a site collection. If a site collection is backed up and restored to a different URL within the same web application, an additional content database must be available to hold the restored copy.
Backup-SPSite http://web-app/sites/original -Path C:\Backup\original.bak
Restore-SPSite http://web-app/sites/originalrestored -Path C:\Backup\original.bak -ContentDatabase <Name_of_new_Content_db>
Once I executed the above commands, a new site, essentially a clone of the original, was restored to the third content database. Keep in mind that with this approach the original site is preserved, and you can work on the newly restored copy.
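If you go the Move-SPSite route with several site collections, you still have to decide which collection goes to which new database. The Python sketch below is one way to plan that outside SharePoint; it is not part of the steps above. The site URLs, sizes, and database names are made up for illustration, and the greedy largest-first assignment is only one reasonable heuristic. It prints Move-SPSite commands for review and does not run anything.

```python
import heapq


def plan_moves(site_sizes_gb, databases):
    """Assign each site collection to the least-loaded database, largest first."""
    heap = [(0.0, name) for name in databases]  # (current load in GB, database)
    heapq.heapify(heap)
    plan = {}
    for url, size in sorted(site_sizes_gb.items(), key=lambda kv: kv[1], reverse=True):
        load, name = heapq.heappop(heap)
        plan[url] = name
        heapq.heappush(heap, (load + size, name))
    return plan


def to_commands(plan):
    """Render the plan as Move-SPSite command lines (text only)."""
    return [f"Move-SPSite {url} -DestinationDatabase {db}" for url, db in plan.items()]


# Hypothetical site collections and sizes in GB.
sites = {
    "http://web-app/sites/a": 40.0,
    "http://web-app/sites/b": 35.0,
    "http://web-app/sites/c": 20.0,
    "http://web-app/sites/d": 10.0,
}
plan = plan_moves(sites, ["WSS_Content_02", "WSS_Content_03"])
for line in to_commands(plan):
    print(line)
```

With these numbers, sites a and d land in one database (50 GB) and b and c in the other (55 GB). Real sizes would have to come from your own farm.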
In case you wish to transfer just one subsite to a different content database, you can follow the strategy below.
Use the -Force flag with Export-SPWeb if you get the error below.
File C:\Backup\export.cmp already exists. To overwrite the existing file use the -Force parameter.
You can import sites only into sites that are based on the same template as the exported site. This refers to the site collection and not the subsite. Otherwise Import-SPWeb fails with the error below.
Import-SPWeb : Cannot import site. The exported site is based on the template STS#3 but the destination site is based on the template STS#0. You can import sites only into sites that are based on same template as the exported site.
# Create the site collection in the targeted content database first.
New-SPSite http://web-app/sites/subsiterestoration2 -OwnerAlias "DOMAIN\user" -Language 1033 -Template STS#3 -ContentDatabase <Name_of_new_Content_db>
# Export the web object; -Force overwrites an existing .cmp file.
Export-SPWeb http://web-app/sites/original/subsitetomove -Path "C:\Backup\export.cmp" -Force
# Create a new web under the new site collection. This is optional, as you can always import into the root web; I created it only to preserve the previous structure.
New-SPWeb http://web-app/sites/subsiterestoration2/subsitemoved -Template "STS#3"
# Finally, import the exported web object into the targeted web.
Import-SPWeb http://web-app/sites/subsiterestoration2/subsitemoved -Path "C:\Backup\export.cmp" -UpdateVersions Overwrite
Final Notes
Keep in mind that all of the transfers were performed on sites without any customizations, such as Nintex workflows or custom event receivers. These were plain sites with several lists and document libraries.
Always make sure that, while you are performing the above tasks, users are not altering data within the site collections in question.
To briefly answer your question: yes, you have the correct idea of what needs to be done if you wish to transfer just a subsite, but you must pick whichever of the above methods suits you best.
Be aware that most of the methods change the URL that points to a subsite, so be cautious if any third-party automations read or update SharePoint data through these URLs.
I will try to keep this answer updated with ways of transferring a subsite, in case anything else comes up.


Use Lookup and For Each Iteration to pull data from different analytics.dev.azure.com projects

Hi, I would like to ask whether this is possible. I am currently working in ADF, and what I want to do is get work items from analytics.dev.azure.com/[Organization]/[Project] and copy them to a SQL database. I am already doing this for one project, but I want to do it for multiple projects without creating multiple copy tasks in ADF, by running a Lookup into a ForEach that iterates through all the team analytics URLs. Is there any way to do this?
We can use the Lookup and ForEach activities to copy data from all the URLs to SQL database tables. The steps are below.
Create a lookup table that contains the entire list of URLs.
Next, in the ForEach activity's settings, set Items to the output of the Lookup activity, using an expression of the form @activity('<lookup activity name>').output.value.
Inside the ForEach activity, use a Copy activity.
In the source, create a dataset and an HTTP linked service. Enter the base URL and the relative URL. I have stored the relative URLs in the lookup table, so I have given @{item().url} as the relative URL.
In the sink, create an Azure SQL Database table for each item in the ForEach activity, or use existing tables, and copy the data to those tables.
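To make the control flow of the steps above concrete, here is a rough Python analogy of Lookup followed by ForEach and Copy. It is not ADF code and makes no network calls. The organization, project, and table names are hypothetical, and the `url` and `table` column names are assumptions about how the lookup table might be laid out.

```python
from urllib.parse import quote

BASE_URL = "https://analytics.dev.azure.com/"  # base URL of the HTTP linked service


def lookup():
    """Stand-in for the Lookup activity: one row per project (hypothetical values)."""
    return [
        {"url": "MyOrg/ProjectA", "table": "WorkItems_ProjectA"},
        {"url": "MyOrg/Project B", "table": "WorkItems_ProjectB"},
    ]


def for_each(items, base_url=BASE_URL):
    """Stand-in for ForEach + Copy: resolve the source URL and sink table per row."""
    jobs = []
    for item in items:
        # Same idea as putting @{item().url} into the dataset's relative URL.
        source = base_url.rstrip("/") + "/" + quote(item["url"].lstrip("/"))
        jobs.append((source, item["table"]))
    return jobs


jobs = for_each(lookup())
for source, table in jobs:
    print(f"copy {source} -> {table}")
```

The point is that the pipeline contains a single Copy activity, and everything that varies per project lives in the lookup rows.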

Different S3 behavior using different endpoints?

I'm currently writing code to use Amazon's S3 REST API and I notice different behavior where the only difference seems to be the Amazon endpoint URI that I use, e.g., https://s3.amazonaws.com vs. https://s3-us-west-2.amazonaws.com.
Examples of different behavior for the GET Bucket (List Objects) call:
Using one endpoint, it includes the "folder" in the results, e.g.:
and, using the other endpoint, it does not include the "folder" in the results:
Using one endpoint, it represents "folders" using a trailing / as shown above and, using the other endpoint, it uses a trailing _$folder$:
Why the differences? How can I make it return results in a consistent manner regardless of endpoint?
Note that I get these same odd results even if I use Amazon's own command-line AWS S3 client, so it's not my code.
And the contents of the buckets should be irrelevant anyway.
Your assertion notwithstanding, your issue is exactly about the content of the buckets, and not something S3 is doing -- the S3 API has no concept of folders. None. The S3 console can display folders, but this is for convenience -- the folders are not really there -- or if there are folder-like entities, they're irrelevant and not needed.
In Amazon S3, buckets and objects are the primary resources, where objects are stored in buckets. Amazon S3 has a flat structure with no hierarchy like you would see in a typical file system. However, for the sake of organizational simplicity, the Amazon S3 console supports the folder concept as a means of grouping objects. Amazon S3 does this by using key name prefixes for objects.
So why are you seeing this?
Either you've been using EMR/Hadoop, or some other code written by someone who took a bad example and ran with it, or something has been doing this differently than it should have for quite some time.
Amazon EMR is a web service that uses a managed Hadoop framework to process, distribute, and interact with data in AWS data stores, including Amazon S3. Because S3 uses a key-value pair storage system, the Hadoop file system implements directory support in S3 by creating empty files with the <directoryname>_$folder$ suffix.
This may have been something the S3 console did many years ago, and apparently (since you don't report seeing them in the console) it still supports displaying such objects as folders in the console... but the S3 console no longer creates them this way, if it ever did.
I've mirrored the bucket "folder" layout exactly
If you create a folder in the console, an empty object with the key "foldername/" is created. This in turn is used to display a folder that you can navigate into, and upload objects with keys beginning with that folder name as a prefix.
The Amazon S3 console treats all objects that have a forward slash "/" character as the last (trailing) character in the key name as a folder
If you just create objects using the API, then "my/object.txt" appears in the console as "object.txt" inside folder "my" even though there is no "my/" object created... so if the objects are created with the API, you'd see neither style of "folder" in the object listing.
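To illustrate how "folders" fall out of a flat key space, here is a small Python sketch that mimics the prefix and delimiter grouping of a List Objects call over an in-memory list of keys. It is a simplified model written for this answer, not the S3 implementation or SDK, and the keys are invented.

```python
def list_objects(keys, prefix="", delimiter="/"):
    """Group flat keys the way a prefix/delimiter listing does.

    Returns (contents, common_prefixes): keys with no further delimiter after
    the prefix, and the rolled-up "folder" prefixes for everything else.
    """
    contents, common_prefixes = [], []
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        cut = rest.find(delimiter) if delimiter else -1
        if cut == -1:
            contents.append(key)
        else:
            rolled_up = prefix + rest[:cut + len(delimiter)]
            if rolled_up not in common_prefixes:
                common_prefixes.append(rolled_up)
    return contents, common_prefixes


keys = [
    "my/object.txt",    # created via the API, no "my/" marker object exists
    "photos/",          # empty marker object, as created by the console
    "photos/cat.jpg",
    "logs_$folder$",    # Hadoop-style marker object
    "logs/app.log",
]
print(list_objects(keys))
print(list_objects(keys, prefix="photos/"))
print(list_objects(keys, delimiter=""))
```

In this model "my/" shows up as a folder even though no such object exists, while "photos/" and "logs_$folder$" are ordinary objects that only appear in a listing because something created them.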
That is probably a bug in the API endpoint which includes the "folder" - S3 internally doesn't actually have a folder structure, but instead is just a set of keys associated with files, where keys (for convenience) can contain slash-separated paths which then show up as "folders" in the web interface. There is the option in the API to specify a prefix, which I believe can be any part of the key up to and including part of the filename.
EMR's s3 client is not the apache one, so I can't speak accurately about it.
In ASF hadoop releases (and HDP, CDH)
The older s3n:// client uses $folder$ as its folder delimiter.
The newer s3a:// client uses / as its folder marker, but will handle $folder$ if there. At least it used to; I can't see where in the code it does now.
The S3A clients strip out all folder markers when you list things; S3A uses them to simulate empty dirs and deletes all parent markers when you create child file/dir entries.
Whatever you have that processes the GET results should just ignore entries with "/" or _$folder$ at the end.
As to why they are different: the local EMRFS is a different code path, using DynamoDB to implement consistency. At a guess, it doesn't need to mock empty directories, as the DDB tables will hold all directory entries.
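The suggestion above, ignoring entries that end in "/" or _$folder$, could look roughly like this in Python. The function names and sample keys are made up for illustration; the input is simply whatever list of keys your listing code has already collected.

```python
FOLDER_SUFFIXES = ("/", "_$folder$")  # the two marker styles discussed above


def is_folder_marker(key):
    """True for placeholder objects that only exist to simulate a directory."""
    return key.endswith(FOLDER_SUFFIXES)


def real_objects(keys):
    """Drop folder markers so listings look the same regardless of who wrote them."""
    return [key for key in keys if not is_folder_marker(key)]


listing = ["photos/", "photos/cat.jpg", "logs_$folder$", "logs/app.log"]
print(real_objects(listing))
```

One caveat: a genuine object whose key happens to end in one of these suffixes would be dropped too, so this assumes your own data does not use such names.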

How to import users in CRM 2011 with source GUID

We have three organization tenants: Dev, Test and Live, all hosted on premise (CRM 2011. [5.0.9690.4376] [DB 5.0.9690.4376]).
Because of the way dialogs use GUIDs to reference records in lookups, we aim to keep the GUIDs of static records the same across all three tenants.
While all other entities are working fine, I am failing to import users while maintaining their GUIDs. I am using Export/Import to get the data from the master tenant (Dev) into the Test and Live tenants. It is very similar to what the 'Configuration Migration tool' does in CRM 2013.
The issue I am facing is that for all other entities I can see the GUID field, and hence I can map it in the import wizard, but no such field shows up for the SystemUser entity when running the import wizard. For example, with Account, I export an account, amend the CSV file and import it into the target tenant. When I do this, I map AccountId (from the target) to the Account column of the source, and as a result this account's AccountId is the same in both source and target.
At this point I am about to give up trying, but that would cause all dialogs that use a User lookup to fail.
Thank you for your help,
Try the following steps. I would strongly recommend trying this on an old, out-of-use tenant before trying it on a live system. I am not sure whether this is supported by MS, but it works for me. (Also, you will have to assign the BU and roles manually after the import.)
Create an Advanced Find. Include all required fields for the SystemUser record. Add criteria that select the list of users you would like to move across.
Save the file as CSV (this will show the first few hidden columns in Excel).
Rename the primary key field (in this case User) and remove all other fields marked Do Not Modify.
Import the file and map this User column (with the GUID) to User in CRM.
After the import, check the GUIDs in both tenants.
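If there are many users, the column clean-up step can be scripted instead of done by hand in Excel. Below is a minimal Python sketch. It assumes the hidden columns in the export are named with a "(Do Not Modify) " prefix (for example "(Do Not Modify) User"); check the header row of your own file, as the exact names are an assumption here. The sample row is invented.

```python
import csv
import io

MARKER = "(Do Not Modify) "  # assumed prefix of the hidden export columns


def prepare_user_csv(text, primary_key="User"):
    """Keep the GUID column under a plain name and drop other hidden columns."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    keep = []  # (column index, output column name)
    for index, name in enumerate(header):
        if name == MARKER + primary_key:
            keep.append((index, primary_key))  # renamed primary key column
        elif not name.startswith(MARKER):
            keep.append((index, name))         # ordinary visible column
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([name for _, name in keep])
    for row in reader:
        writer.writerow([row[index] for index, _ in keep])
    return out.getvalue()


sample = (
    "(Do Not Modify) User,(Do Not Modify) Row Checksum,"
    "(Do Not Modify) Modified On,Full Name,User Name\n"
    "11111111-2222-3333-4444-555555555555,abc123,2014-01-01,"
    "James Wood,mydomain\\james.wood\n"
)
cleaned = prepare_user_csv(sample)
print(cleaned)
```

The resulting file has a User column holding the GUID, which is the column to map in the import wizard.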
Good luck.
My only suggestion is that you could try to write a small console application that connects to both your source and destination organisations.
Using that you can duplicate the user records from the source to the destination preserving the IDs in the process
I can't say 100% it'll work, but I can't immediately think of a reason why it wouldn't. This assumes that none of the users you're copying over already exist in your target environments.
I prefer to resolve these issues by creating custom workflow activities. For example, you could create a custom workflow activity that returns a user record for a domain name passed in as a string.
This means your dialogs contain only shared configuration values, e.g. mydomain\james.wood, which are used to dynamically find the record you need. Your dialog is then linked to a specific record, but without having to encode the source GUID.
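The idea behind that lookup can be shown with a few lines of Python. This is only an illustration of the logic, not CRM SDK code: a real custom workflow activity would be written in .NET and would query the organization service. The user rows and GUIDs are invented; `systemuserid` and `domainname` are used as the field names for the user's ID and logon name.

```python
def find_user_id(users, domain_name):
    """Return the ID of the single user whose domain name matches (case-insensitive)."""
    wanted = domain_name.casefold()
    matches = [u["systemuserid"] for u in users if u["domainname"].casefold() == wanted]
    if len(matches) != 1:
        raise LookupError(f"expected exactly one user for {domain_name!r}, found {len(matches)}")
    return matches[0]


# The same person has a different GUID in each tenant (hypothetical values).
test_users = [{"systemuserid": "aaaaaaaa-0000-0000-0000-000000000001",
               "domainname": "MYDOMAIN\\james.wood"}]
live_users = [{"systemuserid": "bbbbbbbb-0000-0000-0000-000000000002",
               "domainname": "mydomain\\james.wood"}]

configured = "mydomain\\james.wood"  # the only value stored in the dialog
print(find_user_id(test_users, configured))
print(find_user_id(live_users, configured))
```

Because the dialog stores only the domain name, the GUIDs no longer have to match across tenants.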

Is it RESTful to match a URI in a database and display associated content via request forwarding?

So, I'm building a Web site application that will comprise a small set of content files each dedicated to a general purpose, rather than a large set of files each dedicated to a particular piece of information. In other words, instead of /index.php, /aboutme.php, /contact.php, etc., there would just be /index.php (basically just a shell with HTML and ) and then content.php, error.php, 404.php, etc.
To deliver content, I plan to store "directory structures" and associated content in a data table, capture URIs, and then query the data table to see if the URI matches a stored "directory structure". If there's a match, the associated content will be returned to the application, which will use Pear's HTTP_Request2 to send a request to content.php. Then, content.php will display the appropriate content that was returned from the database.
EDIT 1: For example:
a user types www.somesite.com/contact/ into their browser
index.php loads
a script upstream of the HTML header on index.php does the following:
submits a mysql query and looks for WHERE path = $_SERVER[REQUEST_URI]
if it finds a match, it sets $content = $dbResults->content and POSTs $content to /pages/content.php for display. The original URI is preserved, although /pages/content.php and /index.php are actually delivering the content
if it finds no match, the contents of /pages/404.php are returned to the user. Here, too, the original URI is preserved even though index.php and /pages/404.php are actually delivering the content.
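For what it's worth, the lookup in the steps above can be sketched in a few lines. The version below uses Python and an in-memory SQLite table purely for illustration (the real application is PHP and MySQL); the table name, column names, and sample row are assumptions. One detail worth copying is the parameterized query, since the request URI is user input and should not be concatenated into the SQL string.

```python
import sqlite3


def make_pages_db():
    """Create a throwaway table mapping stored paths to content."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE pages (path TEXT PRIMARY KEY, content TEXT NOT NULL)")
    db.execute("INSERT INTO pages VALUES (?, ?)", ("/contact/", "<h1>Contact</h1>"))
    return db


def handle(db, request_uri):
    """Return (status, body) for a request URI; the URI itself is never rewritten."""
    path = request_uri.split("?", 1)[0]  # ignore any query string
    row = db.execute("SELECT content FROM pages WHERE path = ?", (path,)).fetchone()
    if row is None:
        return 404, "Not Found"
    return 200, row[0]


db = make_pages_db()
print(handle(db, "/contact/"))
print(handle(db, "/no-such-page/"))
```

Returning a real 404 status for unmatched paths, rather than a 200 with an error page, keeps the behavior in line with what HTTP clients expect.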
Is this a RESTful approach? On the one hand, the URI is being used to access a particular resource (a tuple in a data table), but on the other hand, I guess I always thought of a "resource" as an actual file in an actual directory.
I'm not looking to be a REST purist, I'm just really delving into the more esoteric aspects of and approaches to working with HTTP and am looking to refine my knowledge and understanding...
OK, my conclusion is that there is nothing inherently unRESTful about my approach, but how I use the URIs to access the data tables seems to be critical. I don't think it's in the spirit of REST to store a resource's full path in that resource's row in a data table.
Instead, I think it's important to have a unique tuple for each "directory" referenced in a URI. The way I'm setting this up now is to create a "collection" table and a "resource" table. A collection is a group of resources, and a resource is a single entity (a content page, an image, etc., or even a collection, which allows for nesting and taxonomic structuring).
So, for instance, /portfolio would correspond with a portfolio entry in the collection table and the resource table, as would /portfolio/spec-ads; however, /portfolio/spec-ads/hersheys-spec-ad would correspond to an entry only in the resource table. That entry would contain, say, embed code for a Hershey's spec ad on YouTube.
I'm still working out an efficient way to build a query from a parsed URI with multiple "directories," but I'm getting there. The next step will be to work the other way and build a query that constructs a nav system with RESTful URIs. Again, though, I think the approach I laid out in the original question is RESTful, so long as I properly correlate URIs, queries, and the data architecture.
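One possible way to resolve a URI with multiple "directories" is to walk it one segment at a time, looking up each segment under the row found for the previous one. The Python and SQLite sketch below is a simplification: it folds the collection and resource tables into a single table with an `is_collection` flag, and all table and column names are assumptions made for the example.

```python
import sqlite3


def make_tree_db():
    """Build a tiny parent/child table for the portfolio example."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE resource (id INTEGER PRIMARY KEY, parent_id INTEGER, "
        "slug TEXT, is_collection INTEGER, content TEXT)"
    )
    rows = [
        (1, None, "portfolio", 1, None),
        (2, 1, "spec-ads", 1, None),
        (3, 2, "hersheys-spec-ad", 0, "<embed code>"),
    ]
    db.executemany("INSERT INTO resource VALUES (?, ?, ?, ?, ?)", rows)
    return db


def resolve(db, uri):
    """Return (id, is_collection, content) for the URI, or None if any segment is missing."""
    parent_id, row = None, None
    for slug in [segment for segment in uri.split("/") if segment]:
        # "IS ?" also matches NULL, which is the parent of top-level entries.
        row = db.execute(
            "SELECT id, is_collection, content FROM resource "
            "WHERE slug = ? AND parent_id IS ?",
            (slug, parent_id),
        ).fetchone()
        if row is None:
            return None
        parent_id = row[0]
    return row


tree = make_tree_db()
print(resolve(tree, "/portfolio/spec-ads/hersheys-spec-ad"))
print(resolve(tree, "/portfolio/missing"))
```

This issues one query per segment, which is simple but not the most efficient option; a recursive query or a stored path index would reduce round trips.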
The more I walk down this path, the more I like it...

Where is the Word 2007 schema library stored?

Word 2007 allows XML schemas to be attached to a document (under the Developer toolbar | XML group | Schema button). Where is this schema library information stored?
I have documents that I have created with custom XML tags based on a schema but when I pass on the document and the schema to someone else the schema is marked as unavailable, presumably because the file location of the schema is different.
Is there some way to edit this information to change the path to a given schema?
It's not stored with the docx, just the path to it is stored. So passing a document around will almost always break the link. VSTO can get around this by embedding the XSD as a resource in the app.
But for VBA, it's trickier: you need a path you can rely on on each user's computer, and then you deploy your XSD there. One way is to handle the Document_Open event (or just use AutoOpen) so that when a user opens the document (warning: macro security settings need to be adjusted), you write your XSD, hard-coded as a string in the code-behind, to a file and then attach that file with a routine like:
Dim objSchema As XMLNamespace
Set objSchema = Application.XMLNamespaces.Add("c:\something\mynewlycreated.xsd")
objSchema.AttachToDocument ActiveDocument
So that you don't leave artifacts behind, you could then delete that XSD from the user's computer on Document_Close or AutoClose.