Excel to XML conversion using ADO.NET

How do I convert Excel data into an XML file using ADO.NET?

You can use the Microsoft Jet OLEDB 4.0 Data Provider to read the Excel file. Information about how to establish a connection to an Excel file can be found here.
This article explains how to read an Excel file using the provider. Once you have read the data, you can compose your XML document using LINQ to XML or the System.Xml classes.
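As a rough sketch of the LINQ to XML step (it assumes the Excel rows are already in a DataTable, e.g. via OleDbDataAdapter.Fill, and that the column names are valid XML element names; the method and element names below are only illustrative):
// Sketch: turn a DataTable filled from Excel into an XML document with LINQ to XML.
// Requires System.Data, System.Linq and System.Xml.Linq.
static void WriteTableAsXml(DataTable table, string path)
{
    var doc = new XDocument(
        new XElement("Rows",
            table.Rows.Cast<DataRow>().Select(row =>
                new XElement("Row",
                    table.Columns.Cast<DataColumn>()
                        .Select(col => new XElement(col.ColumnName, row[col]))))));
    doc.Save(path);
}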

In Excel, you can save the file to XML from the File menu by changing the saved file type to "XML Spreadsheet".
If you want to read an Excel XML file with ADO.NET, try the XmlReader.
Or see this step-by-step example from Microsoft.
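As a bare-bones example of the XmlReader approach (this assumes the workbook was saved as "XML Spreadsheet 2003", i.e. SpreadsheetML; the file name is just a placeholder):
// Sketch: stream through a SpreadsheetML file and print every cell's <Data> value.
using (XmlReader reader = XmlReader.Create("Book1.xml"))
{
    while (reader.Read())
    {
        if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "Data")
        {
            Console.WriteLine(reader.ReadElementContentAsString());
        }
    }
}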

I've not used ADO.NET, but I've used XQuery very successfully for this. Use Excel's export to create an XML file, then write XQuery/XPath expressions to convert it as you want. Excel's XML export format is pretty gnarly, but it does do the job. The Oxygen 30-day eval license can lighten the XQuery debugging work.

Use this code:
public static DataSet exceldata(string filelocation)
{
    DataSet ds = new DataSet();
    string excelConnStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + filelocation + ";Extended Properties='Excel 8.0';";
    using (OleDbConnection excelConn = new OleDbConnection(excelConnStr))
    {
        excelConn.Open();
        OleDbCommand excelCommand = new OleDbCommand("SELECT UUID, `PATTERN` as PATTERN, `PLAN` as PLAN FROM [PATTERNS$]", excelConn);
        OleDbDataAdapter excelDataAdapter = new OleDbDataAdapter();
        excelDataAdapter.SelectCommand = excelCommand;

        DataTable dtPatterns = new DataTable();
        excelDataAdapter.Fill(dtPatterns);
        dtPatterns.TableName = "Patterns";
        ds.Tables.Add(dtPatterns);
    }
    return ds;
}
Then convert the returned DataTable to XML with DataTable.WriteXml().
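For example (the file paths here are only placeholders):
DataSet ds = exceldata(@"C:\data\patterns.xls");
// WriteSchema is optional; it embeds the schema so the XML round-trips back into a DataTable.
ds.Tables["Patterns"].WriteXml(@"C:\data\patterns.xml", XmlWriteMode.WriteSchema);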

Related

Aspose.Cells - Opening OpenXml file generated by EPPlus

I generated an Excel spreadsheet in OpenXml format using EPPlus. When I try to open it using Aspose.Cells, the program returns the following error:
Aspose.Cells.CellsException: ‘Invalid cell name’
I believe Aspose.Cells didn't detect the correct file format.
How can I open this file?
I am using C# and .NET Core.
Thanks
I am Ahsan Iqbal from Aspose.
I have tested this scenario with my own sample code, but the issue does not reproduce:
static void Main(string[] args)
{
    Console.WriteLine("Creating XLSX file using EPPlus");
    CreateExcelFileInEpplus();
    Console.WriteLine("Success in creating XLSX file using EPPlus");
    ReadExcelFileUsingAsposeCells();
    Console.WriteLine("Success in reading XLSX file using Aspose.Cells");
}
static void CreateExcelFileInEpplus()
{
    var newFile = new FileInfo("output.xlsx");
    if (File.Exists(newFile.Name))
        File.Delete(newFile.Name);
    using (ExcelPackage xlPackage = new ExcelPackage(newFile))
    {
        xlPackage.Workbook.Worksheets.Add("Sheet1");
        // do work here
        xlPackage.Save();
    }
}
static void ReadExcelFileUsingAsposeCells()
{
    Aspose.Cells.Workbook workbook = new Aspose.Cells.Workbook("output.xlsx");
    Console.WriteLine(workbook.Worksheets[0].Name);
}
This test shows that Aspose.Cells can read the Excel spreadsheet generated by EPPlus. Your sample code and output files are required for analysis, so please create a thread at the Aspose.Cells Forum and share your code, the output XLSX file, the versions of Aspose.Cells and .NET Core, your operating system, and any other environment details. Our support team will assist you further there.

Is there a way to read an Excel file using Dataflow

Is there a way to read an Excel file stored in a GCS bucket using Dataflow?
I would also like to know whether we can access the metadata of an object in GCS from Dataflow, and if so, how.
CSV files are often used to get data out of Excel. These files can be split and read line by line, so they are ideal for Dataflow. You can use TextIO.Read to pull in each line of the file, then parse the lines as CSV.
If you want to use a different, binary Excel format, then I believe you would need to read in the entire file and use a library to parse it. I recommend using CSV files if you can.
As for reading the GCS metadata: I don't think you can do this with TextIO, but you could call the GCS API directly to access the metadata. If you only do this for a few files at the start of your program, it will work and not be too expensive. If you need to read many files like this, you'll be adding an extra RPC per file.
Be careful not to read the same file multiple times; I suggest reading each file's metadata once and then writing the metadata out to a side input. Then in one of your ParDos you can access the side input for each file.
Useful links:
ETL & Parsing CSV files in Cloud Dataflow
https://cloud.google.com/dataflow/java-sdk/JavaDoc/com/google/cloud/dataflow/sdk/io/TextIO.Read
https://cloud.google.com/dataflow/model/par-do#side-inputs
private static final int BUFFER_SIZE = 64 * 1024;

private static void printBlob(com.google.cloud.storage.Storage storage, String bucketName, String blobPath) throws IOException, InvalidFormatException {
    try (ReadChannel reader = storage.reader(bucketName, blobPath)) {
        InputStream inputStream = Channels.newInputStream(reader);
        Workbook wb = WorkbookFactory.create(inputStream);
        StringBuffer data = new StringBuffer();
        for (int i = 0; i < wb.getNumberOfSheets(); i++) {
            String fName = wb.getSheetAt(i).getSheetName();
            File outputFile = new File("D:\\excel\\" + fName + ".csv");
            // Write each sheet out as its own CSV file
            try (FileOutputStream fos = new FileOutputStream(outputFile)) {
                XSSFSheet sheet = (XSSFSheet) wb.getSheetAt(i);
                Iterator<Row> rowIterator = sheet.iterator();
                data.delete(0, data.length());
                while (rowIterator.hasNext()) {
                    // Get each row
                    Row row = rowIterator.next();
                    data.append('\n');
                    // Iterate through each column of the row
                    Iterator<Cell> cellIterator = row.cellIterator();
                    while (cellIterator.hasNext()) {
                        Cell cell = cellIterator.next();
                        // Check the cell type
                        switch (cell.getCellType()) {
                            case Cell.CELL_TYPE_NUMERIC:
                                data.append(cell.getNumericCellValue() + ",");
                                break;
                            case Cell.CELL_TYPE_STRING:
                                data.append(cell.getStringCellValue() + ",");
                                break;
                            case Cell.CELL_TYPE_BOOLEAN:
                                data.append(cell.getBooleanCellValue() + ",");
                                break;
                            case Cell.CELL_TYPE_BLANK:
                                data.append("" + ",");
                                break;
                            default:
                                data.append(cell + ",");
                        }
                    }
                }
                fos.write(data.toString().getBytes());
            }
        }
    }
}
You should be able to read the metadata of a GCS file by using the GCS API, but you will need the filenames. You can get them by running a ParDo or another transform over a PCollection<String> that holds the filenames.
We don't have any default readers for Excel files. You can parse a CSV file by using a text input (see ETL & Parsing CSV files in Cloud Dataflow).
I'm not very knowledgeable about Excel and how the file format is stored. If you want to process one file at a time, you can use a PCollection<String> of filenames and then use some library to parse each Excel file.
If an Excel file can be split into easily parallelizable parts, I'd suggest you take a look at this doc (https://beam.apache.org/documentation/io/authoring-overview/). (If you are still using the Dataflow SDK, it should be similar.) It may be worth splitting the file into smaller chunks before reading to get more parallelism out of your pipeline. In this case you could use IOChannelFactory to read from the file.

How to read from excel and write into excel in Protractor?

I have integrated with Rally, which downloads test cases. Every test case has its own test data in an Excel spreadsheet.
I am planning to consolidate all the test cases' Excel data into a single sheet and to read the test data from this consolidated sheet as part of data-driven testing.
So I would like to know how to read from and write to Excel in Protractor.
I hope I am clear.
Thank you.
You can use one of these node packages.
https://www.npmjs.com/package/xlsx
https://www.npmjs.com/package/edit-xlsx
I think the second one would be ideal for you as you need to edit existing excel files.
I'm using exceljs to make my test cases data-driven.
https://www.npmjs.com/package/exceljs
Code sample for reading from Excel:
var Excel = require('exceljs');
var wrkbook = new Excel.Workbook();
wrkbook.xlsx.readFile('Auto.xlsx').then(function () {
    var worksheet = wrkbook.getWorksheet('Master');
    worksheet.eachRow(function (Row, rowNumber) {
        console.log("Row " + rowNumber + " = " + JSON.stringify(Row.values));
    });
});

Dynamically changing CSV data source using ApplyLogOnInfo

I have a .rpt file that I created by setting its data source to a text (CSV) file using the Access/Excel (DAO) option.
Now I want to load the same .rpt file from C# code; each time, my C# code will change the input file, and I want a new report to be generated based on the data in the new text file.
I am using the following code, and when I export the report to a PDF document, it still displays the data from the old input file.
I have checked off the "save data with report" and "verify on first refresh" options in the .rpt file.
What am I missing here?
CODE:
cryRpt = new ReportDocument();
cryRpt.Load(reportfile);
Tables tables = cryRpt.Database.Tables;
TableLogOnInfo tableLogonInfo;
foreach (Table table in cryRpt.Database.Tables)
{
    tableLogonInfo = table.LogOnInfo;
    tableLogonInfo.TableName = "MYdata_BS_NEW#csv";
    table.Location = "MYdata_BS_NEW#csv";
    table.ApplyLogOnInfo(tableLogonInfo);
}
cryRpt.Refresh();
// After this I export the report to a PDF document.

How to read an .xlsx (Excel 2007) file using ADO.NET? I am getting a "Could not find installable ISAM" error

I need to work in .NET 2.0, so I can't use OpenXML.
This is my source code, and I have already installed AccessDatabaseEngine.exe.
But I am still getting the exception:
"Could not find installable ISAM".
I have also tried "Extended Properties=Excel 8.0" in the connection string.
static void Main(string[] args)
{
    DataSet dataSet = new DataSet();
    OleDbConnection connection = new OleDbConnection(@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=|Data Directory|\HSC.xlsx;Extended Properties=Excel 12.0;HDR=YES;");
    OleDbDataAdapter dataAdapter = new OleDbDataAdapter("select * from [Sheet1$]", connection);
    dataAdapter.Fill(dataSet);
}
According to Carl Prothman, that should be
Extended Properties="Excel 12.0 Xml";
-- http://www.connectionstrings.com/excel-2007
In more detail:
OleDbConnection connection = new OleDbConnection(@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\Docs\Book2.xlsx;Extended Properties='Excel 12.0 Xml;HDR=YES;'");
Note the single quotes.
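If it helps, here is a minimal end-to-end sketch using that connection string (the file path and sheet name are placeholders for your own workbook):
using (OleDbConnection connection = new OleDbConnection(@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\Docs\Book2.xlsx;Extended Properties='Excel 12.0 Xml;HDR=YES;'"))
{
    OleDbDataAdapter dataAdapter = new OleDbDataAdapter("select * from [Sheet1$]", connection);
    DataSet dataSet = new DataSet();
    dataAdapter.Fill(dataSet); // Fill opens and closes the connection itself
    Console.WriteLine(dataSet.Tables[0].Rows.Count);
}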
I prefer to use the Microsoft Open XML SDK 2.0 for this kind of functionality. It has a really nice interface, and it does not require Office to be installed on the machine reading the XLSX file, which is a good thing.
I'm writing this from my mobile, so it's hard to provide a link, but a Google search should easily find it for you.
Give it a try. I think you will like it.
EDIT
If you have to use .NET 2.0, you can use the Jet variant of the OLE DB provider instead.
That means you will do something like this to connect:
OleDbConnection connection = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;" +
    "Data Source='" + filename + "';" +
    "Extended Properties=\"Excel 8.0;HDR=No;IMEX=1;\"");
Then you can query it like in your example above:
OleDbDataAdapter objAdapter = new OleDbDataAdapter("select * from [Sheet1$]", connection);
Try it! Just note that Jet has some strange logic for deciding whether a column is numeric or not. See the following SO question for details: Problem with using OleDbDataAdapter to fetch data from an Excel sheet
You should make sure that the connection string looks like the following (even if you are accessing Microsoft Excel version 10):
MyConnection = new System.Data.OleDb.OleDbConnection(
    @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source='D:\csharp-Excel.xls';Extended Properties='Excel 12.0 Xml;HDR=Yes;'");