Data modeling for multilingual content in a CMS?

Is there a generally accepted solution for storing multilingual content in a database? The company I used to work for had me build a proprietary CMS and they wanted the possibility to support languages dynamically. I was pretty green then so I had a table "Languages" to hold languages and a table "Content" that held the tombstone data (published_datetime, modified_datetime, expiry_datetime, etc.).
To hold the actual content I had a table called "ContentBody" with the columns language_id, content_id, title, and content.
This solution worked but I didn't really bother looking more into it. I currently find myself with a lot of free time on my hands and decided I'd dabble again in CMS development and this is one of those aspects of it that I always felt I hadn't done right. I looked at the WordPress ERM diagram and it doesn't seem to have a table for multilingual content.
Any advice or comments would be awesome :)

Having a content table with compound key content_id + language_id is the approach I would follow, assuming your requirements are exactly as you stated. It's that simple unless you have other CMS requirements (e.g. versioning, workflow, approval process, etc) that would complicate it.
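For illustration, here is a minimal sketch of that layout using Python's built-in sqlite3 module. The table and column names follow the question; the `code` column, the sample rows, and the `get_body` helper are assumptions made up for the example, not part of the original design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE languages (
    language_id INTEGER PRIMARY KEY,
    code        TEXT NOT NULL UNIQUE      -- assumed column, e.g. 'en', 'fr'
);
CREATE TABLE content (
    content_id         INTEGER PRIMARY KEY,
    published_datetime TEXT,
    modified_datetime  TEXT,
    expiry_datetime    TEXT
);
CREATE TABLE content_body (
    content_id  INTEGER NOT NULL REFERENCES content(content_id),
    language_id INTEGER NOT NULL REFERENCES languages(language_id),
    title       TEXT NOT NULL,
    content     TEXT NOT NULL,
    PRIMARY KEY (content_id, language_id)  -- one body per item per language
);
""")

# Sample data (made up for the example).
with conn:
    conn.executemany("INSERT INTO languages VALUES (?, ?)", [(1, "en"), (2, "fr")])
    conn.execute("INSERT INTO content VALUES (1, '2024-01-01', '2024-01-02', NULL)")
    conn.executemany(
        "INSERT INTO content_body VALUES (?, ?, ?, ?)",
        [(1, 1, "Hello", "English body"), (1, 2, "Bonjour", "Contenu en français")],
    )


def get_body(conn, content_id, language_code):
    """Return (title, content) for one item in one language, or None."""
    return conn.execute(
        """SELECT b.title, b.content
           FROM content_body AS b
           JOIN languages AS l ON l.language_id = b.language_id
           WHERE b.content_id = ? AND l.code = ?""",
        (content_id, language_code),
    ).fetchone()


print(get_body(conn, 1, "fr"))
```

Adding a language is then just a new row in `languages`; the compound primary key rejects a second body for the same item and language.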

Related

Can we automate migrating to SDL Tridion?

We have finished migrating a website from an old CMS to SDL Tridion. We have thousands of clients, of which fewer than five have been migrated. Now suppose we need to migrate the remaining thousands of clients; obviously we cannot do that manually. Is there a way to develop an automated solution against SDL Tridion using any APIs it provides? If so, where can we find documentation for those APIs? Are there any books or online tutorials?

All very technical answers. Whichever route you choose, you need to weigh the option of a technical migration (and trying to get that right) against simply employing a load of students to copy and paste.

Regardless of the CMS, the complexity of a migration depends on how well organized the content is in the system you are migrating from.
I categorize migrations into three types according to origin and destination:
1--> CMS to CMS
2--> Database to CMS
3--> WebSite to CMS
If the original source is a database or another CMS, the complexity is typically lower, because the content is already structured.
You have to extract it and map the existing content onto the structure it will have in the new system.
If the goal is to migrate an existing website into a CMS, the complexity increases, because the content is less organized than it would be in a CMS.
Again, if the content on the site is properly structured, it is still possible to automate the migration, but in most cases these are old sites maintained manually.
There are commercial tools that crawl the content of a site and apply patterns to identify common elements, common content, common metadata, and structure. They can massage the original content and apply rule-based logic to give it structure. However, even the best tool has hard work to do when the source is disorganized.
I have also seen migrations that cut the final HTML into pieces and put those into the CMS. That is an easy approach, but of course the wrong one, as you are not taking any advantage of the CMS.
There are also three types according to the kind of source we migrate from and the kind of result we want to obtain:
1--> Content to Content
2--> (HTML + Content all together) into (HTML) + (Content) separated
3--> (HTML + Content + Code all together) into (HTML) + (Content) + (Code) separated
Content to Content migration is the least complex.
The second option is of course more complex, as you have to separate the content from the HTML, which will become templates.
The third option is more complex still: if you extract the HTML of the page (using an HTTP client, for instance, as most of the commercial tools do), you are not capturing the logic of the page. For this case you need to work at the file level.
Do a very thorough analysis before you start a migration, as things can turn complex.
Only if you have very good knowledge of the original system and solid patterns to apply should you think about automation.

Tridion has extensive APIs and these are thoroughly documented. Your starting point for SDL Tridion 2011 is https://www.sdltridionworld.com/downloads/documentation/SDLTridion2011SP1/index.aspx
Automated migrations are perfectly possible, however API support is not the limiting factor here. Understanding your data in your source and target scenarios is much more important.

I would consider contacting Kapow or Vamosa who both specialize in crawling sites and then importing them to a CMS. They both have connectors for SDL Tridion. This may save your clients both time and money.

Every migration is different, unless you are migrating "thousands of" sites (assuming a client is a site) from the same source type to the same destination (SDL Tridion in this case) with extremely close data models. Several SDL Tridion partners are already solving this problem and have built, or are building, assisted migration automation tools. Get in touch with us if you need more information.

Would there be any benefit to using redis/nosql over postgres for my bug tracking application?

I'm making a site to document browser bugs where users can submit a bug and users can submit solutions/workarounds to these bugs. I'll have stuff like:
screenshots of bugs
browser rendering engines
browsers
tags for each bug
bug categories ( css, html, js )
solutions per bug which include code snippets
usual date/time, author, date modified
Since I'm just starting this site, I won't really need to scale off the bat. I'm just wondering if the data is more ideal for something like redis, or should I stick with rdbms ( in my case, Postgres )?

Bug information revolves around products and users, and that data benefits from relational structure. (You can look at a host of existing bug trackers for examples.) If you do find you need hierarchical data structures (which Redis leans toward), there are several different implementations of tree structures in traditional SQL, and Postgres offers some additional constructs like arrays and ltree structures. Additionally, Postgres has fairly proven methods for storing binary data (like screenshots) and large text data, which, depending on your NoSQL engine, might not be as stable as you'd hope. I guess there might be some benefit in learning another system (on the other hand, others would argue that learning your existing tools better is more beneficial), but from a technical standpoint there isn't really an advantage.

Both the MySQL and Postgres development teams recommend against storing images and binary data inside the database.
Instead you can store the images in a directory, and the filename can be either the ID from the database, or md5(ID + secret) if you worry that people may "hack" the system and see images they must not see.
Doing this gives you a smaller database and faster access, since you can serve the images directly with your web server.
I am a huge Redis fan, but this project looks more like a job for an RDBMS to me.
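As a rough sketch of the md5(ID + secret) filename idea in Python (the secret value, the directory, and the `screenshot_path` helper are placeholders invented for the example):

```python
import hashlib
from pathlib import Path

SECRET = "change-me"                     # hypothetical secret, keep it out of source control
IMAGE_DIR = Path("uploads/screenshots")  # hypothetical directory served by the web server


def screenshot_path(image_id: int, extension: str = "png") -> Path:
    """Derive a hard-to-guess filename from the database ID and the secret."""
    digest = hashlib.md5(f"{image_id}{SECRET}".encode("utf-8")).hexdigest()
    return IMAGE_DIR / f"{digest}.{extension}"


print(screenshot_path(42))
```

The database then only needs the numeric ID, because the path can be recomputed on demand. This follows the md5 suggestion as written; an HMAC built on SHA-256 (Python's `hmac` module) is a stronger construction if guessing filenames is a real concern.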

Searching for a document format.. flowing layout + page control

I am bouncing around the idea of creating a custom document versioning system to use on business rule manuals. These manuals are broken up into outlined sections which contain one rule per section which are outlined in various ways (1.1, 1.2, etc). There are many manuals which contain the same rule for different locations in the country (down to the state/county level), however many locations will have different versions of the rules depending on business needs or whatnot.
My thought is to create a system which will manage versions of each section/rule separately. This would make the management of this mess much easier to maintain (think hundreds of manuals times hundreds of rules), and it would make fielding query requests from management much quicker.
Ok, it's a fairly easy and straightforward design to this point. Now for the monkey wrench. These rules are regulated by government agencies, so they must be submitted to and approved by state agencies. In doing this, many states require only the exact pages which are updated for each request to be submitted for approval. Once they are approved, these pages will get a new effective date and the rest of the manual will remain the same. There are business reasons for this process.
So my choice of document format has to allow for flowing layout much like Word, but I also need to be able to programmatically determine the page range of these sections and whether changes or additions will cause a repagination.
The most complex layout will contain only tables, headers/footers, and a table of contents. I have thought about using OOXML, but I don't see a way to determine pagination without loading Word which is something I would prefer to avoid. I could create my own pagination algorithm, but that sounds a lot like reinventing the wheel.
Can anyone offer pointers to a solution whether it is an open document format, a book, or something else? Thank you for taking the time to read this.

If you want a truly modular document, then DocBook might be worth a look. You have all the rich formatting you need but it does need a bit of work. It really depends on who's doing the authoring and what tools they're comfortable using. DocBook is a rich mark-up language and you can do anything from work in the base plain text file or look at a number of WYSIWYG editors, e.g. ArborText.
It's not Word though - which might be enough to put your authors off!
If you did go with DocBook, you would maintain each document section in a separate text file so your versioning solution would work well. DocBook can produce output in a number of formats simultaneously so you could have an HTML version, an OOXML version, and a PDF version produced from the same source. A PDF version of each changed section might be appropriate to send to government agencies for approval.
On pagination, you could make life a lot easier for yourself by not having continuous page numbers. Use section or chapter based page numbering, e.g. page I-1, I-2, ..., II-1, II-2.
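To show why section-based numbering helps, here is a small Python sketch. The page counts per section are made-up inputs, and `to_roman`, `page_labels`, and `changed_sections` are hypothetical helpers; how you actually measure the page count of a section still depends on your rendering tool.

```python
def to_roman(number: int) -> str:
    """Convert a positive integer to a Roman numeral."""
    numerals = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    parts = []
    for value, symbol in numerals:
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


def page_labels(pages_per_section):
    """Build labels such as I-1, I-2, II-1 for each section."""
    return [
        [f"{to_roman(section)}-{page}" for page in range(1, count + 1)]
        for section, count in enumerate(pages_per_section, start=1)
    ]


def changed_sections(before, after):
    """Return the 1-based sections whose page labels differ.

    Assumes both lists describe the same sections in the same order.
    """
    pairs = zip(page_labels(before), page_labels(after))
    return [i for i, (old, new) in enumerate(pairs, start=1) if old != new]


# Section II grows from 3 pages to 4; the other sections keep their labels.
print(changed_sections([2, 3, 1], [2, 4, 1]))
```

With continuous numbering, the extra page in section II would renumber every page after it; with section-based labels only section II has to be resubmitted.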

How to create a deliverable for a front-end engineer?

This is a question about the development workflow of front end engineers. I am starting a project for a rather large site with lots of pages, each page has multiple steps, and it's very difficult to lay out all the content in a spreadsheet.
The content of each page will be delivered in a spreadsheet cell, and some pages have multiple variable sections that are determined by the user's preferences.
I was asked my opinion about how to structure the deliverable. Is there a best practice for structuring this kind of deliverable? A poorly structured deliverable can be almost as mind-numbing as writing code with pen and paper.
Do you have any tools, formats, practices for creating deliverables that are easy to work with?

It sounds like you are just doing the UI design and then giving it to the front-end engineers.
If that is correct, I would suggest that you see if you can do the rough html/css work to get the page to look as you want, and then they can go in and give it the functionality, but that way you have an idea what is possible.
You can do much of the work, then leave comments about trying to center something a bit better, for example.
I am not a big fan of just getting the design on paper or as an image, it would be easier to just get the html/css.
There are plenty of tools now that make CSS and HTML easy to do. Even if you leave the CSS inside the HTML, they can separate the two, and it would still be a huge help to the designers.
Just do one page, and give it to them, and then come back in a day or two and get feedback as to what their thoughts are, and how you can improve what you give them.
As you go through this process, after a while both groups will know what to expect and you can get the rest done quickly.
This is more of an agile methodology with the front-end engineers as your customers.

My suggestion would be mockups or wireframes for the pages. Mockups would be examples of the pages in various states while the wireframe is a detailed document of the structure of the page.

HTML and CSS are way too complicated for mockup use. I usually start by creating a requirements backlog for the UI and functionality (just a list of prioritized requirements in Excel).
Especially for a large site, you should also have the process and data flow definitions done (in UML or some other form of description) to help you define those requirements.
Based on these you will know what steps (i.e. pages) the site's functionality needs, and what the page hierarchy and structure will be like. This makes it much easier to get a grasp of the whole thing.
After that we create quick wireframes and visualize the end result with quick mockups done as images in Photoshop or similar. In my experience these are absolutely vital, as they help the customer (and other stakeholders) actually understand what is being done. HTML and CSS are simply too slow for running multiple iterations of this.

What is the minimum specification which justifies the name Content Management System?

I've written some simple software which helps me manage and disseminate engineering data on a company intranet. It's pretty flexible about adapting to new content and I wonder if it justifies the description 'Content Management System'.
A previous question: how to define content management did a pretty good job of defining a CMS, but I've a feeling my approach fails to reach the bar.
What is the minimum set of features considered essential in a Content Management System, and are there names for subsets of these features?
For example, I've seen some software described as a 'dashboard'. Is this a subset of a CMS?
I'm not really interested in testimonials for other CMS solutions.

It's a bit like jazz: if you have to ask, it ain't ...
To my mind discussions about such terminology tend to be in the Marketing space. If your software is doing something useful, who cares what it is, or more to the point what label you put on the tin?

I came across a simple definition in a text about what you could possibly consider an 'other CMS solution', but we web-frameworkers tend to have bizarre views on CMSs.
Content management systems (CMS) let users create and edit pages on a site dynamically through a web-based interface. Sometimes called brochureware sites because they tend to be used in the same fashion as traditional printed brochures handed out by businesses.
Practical Django Projects, 1st ed., James Bennett
http://www.apress.com/book/preview/9781590599969
Not the final answer, but one definition.

There are two ways to look at it. One is the name itself: "Content Management System". You could argue that if it is a system to manage content, it's a content management system (small letters). The other way to look at it is user expectation. What does a test group of representative users or developers in your target audience expect when they hear "CMS"? Editing the textual content of a website comes to mind in this case.
If you want to provide a description useful to a broader audience, you have to understand their expectations. If your own interpretation is that those expectations would be unfulfilled, you might come up with a more specific label. Perhaps Engineering Data Management System, or something more specific to your purpose. I think you will be much happier with this.
Lastly, if you need to categorize it on some form of public resource website, you might have to go up or laterally from an existing CMS category. Or, use the category, but a more specific label for the product itself.
