Two Arabic font representations in Unicode
I am working on an application that displays Arabic text, and I have found that the same text can be written in two different ways. I don't understand why this happens, or how to convert one form to the other so that the UI is consistent.
Here is an example of an Arabic phrase.
اللّهُمَّ صَلِّ عَلَى مُحَمَّدٍ وَآلِ مُحَمَّدٍ
اَﻟﻠّﻬُﻢﱠ ﺻَﻞﱢ ﻋَﻠﻰ ﻣُﺤَﻤﱠﺪٍ وَ ﺁلِ ﻣُﺤَﻤﱠﺪٍ
The two look the same in the preview, but the underlying text is different. I want both to produce the same result.
Here is how I can tell the two apart in Notes.
They look different in my browser, and getting the Unicode code points for each gives me, in the order posted:
U+627 U+644 U+644 U+651 U+647 U+64F U+645 U+651 U+64E U+20 U+635 U+64E U+644 U+651 U+650 U+20 U+639 U+64E U+644 U+64E U+649 U+20 U+645 U+64F U+62D U+64E U+645 U+651 U+64E U+62F U+64D U+20 U+648 U+64E U+622 U+644 U+650 U+20 U+645 U+64F U+62D U+64E U+645 U+651 U+64E U+62F U+64D
U+627 U+64E U+FEDF U+FEE0 U+651 U+FEEC U+64F U+FEE2 U+FC60 U+20 U+FEBB U+64E U+FEDE U+FC62 U+20 U+FECB U+64E U+FEE0 U+FEF0 U+20 U+FEE3 U+64F U+FEA4 U+64E U+FEE4 U+FC60 U+FEAA U+64D U+20 U+648 U+64E U+20 U+FE81 U+644 U+650 U+20 U+FEE3 U+64F U+FEA4 U+64E U+FEE4 U+FC60 U+FEAA U+64D
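For reference, a listing like the one above can be produced with a few lines of Python. This is only a minimal sketch using the standard `unicodedata` module; the helper names `code_points` and `describe` are made up for illustration.

```python
import unicodedata

def code_points(text: str) -> str:
    """Return the code points of text in U+XXX notation, space separated."""
    return " ".join(f"U+{ord(ch):X}" for ch in text)

def describe(text: str) -> list[str]:
    """Pair each code point with its official Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]

# First three characters of each variant, taken from the two listings.
first = "\u0627\u0644\u0644"
second = "\u0627\u064E\uFEDF"

print(code_points(first))
print(code_points(second))
for line in describe(second):
    print(line)
```

Printing the names alongside the code points makes it easy to spot characters that come from the Arabic Presentation Forms blocks rather than the basic Arabic block.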
Looking these code points up, the first three characters of the first line are ALEF, LAM, LAM, and the first three of the second line are ALEF, FATHA, LAM INITIAL FORM.
That is odd, because an initial form should not appear in the middle of a word. It looks like your data has not been cleaned correctly. I don't know of a way to fix this other than checking each letter.
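One way to do the letter-by-letter check mechanically is Unicode compatibility normalization. The second string uses code points from the Arabic Presentation Forms blocks (U+FB50 to U+FDFF and U+FE70 to U+FEFF), and NFKC maps those back to the base letters. The following is a sketch under assumptions, not a complete cleaning routine: the function names are hypothetical, and dropping the leading space is a choice made here because isolated-form mark ligatures such as U+FC60 decompose to SPACE plus the marks.

```python
import unicodedata

def is_presentation_form(ch: str) -> bool:
    """True for code points in Arabic Presentation Forms-A or -B."""
    return "\uFB50" <= ch <= "\uFDFF" or "\uFE70" <= ch <= "\uFEFC"

def to_base_arabic(text: str) -> str:
    """Map presentation-form characters to base Arabic letters and marks."""
    out = []
    for ch in text:
        if is_presentation_form(ch):
            mapped = unicodedata.normalize("NFKC", ch)
            # Isolated-form mark ligatures (e.g. U+FC60) decompose to
            # SPACE + combining marks; drop that artificial space.
            if len(mapped) > 1 and mapped[0] == " " and all(
                unicodedata.combining(c) for c in mapped[1:]
            ):
                mapped = mapped[1:]
            out.append(mapped)
        else:
            out.append(ch)
    # NFC puts combining marks into canonical order, so that
    # SHADDA+FATHA and FATHA+SHADDA end up as the same sequence.
    return unicodedata.normalize("NFC", "".join(out))

# MEEM FINAL FORM + SHADDA-WITH-FATHA ligature vs. MEEM + SHADDA + FATHA
print(to_base_arabic("\uFEE2\uFC60") == to_base_arabic("\u0645\u0651\u064E"))
```

Note that the two phrases in the question also differ in a few diacritics and in one space, so they would still not compare equal after this step; those remaining differences are in the data itself.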
Related
mongo-rust-driver v2.1.0 compilation failure
I'm on rustc 1.60.0-nightly (17d29dcdc 2022-01-21) running on x86_64-pc-windows-msvc. I don't know how to solve this: I stripped main down to almost nothing and it still reports an error.
main.rs:
    use std::error::Error;

    #[tokio::main]
    async fn main() -> Result<(), Box<dyn Error>> {
        println!("Hello, world!");
        Ok(())
    }
error:
       Compiling mongodb v2.1.0
    error: internal compiler error: compiler\rustc_mir_transform\src\generator.rs:755:13: Broken MIR: generator contains type ClientOptionsParser in MIR, but typeck only knows about {ResumeTy, impl AsRef<str>, std::option::Option<resolver_config::ResolverConfig>, bool, client::options::ClientOptions, [closure#C:\Users\BORBER\.cargo\registry\src\mirrors.tuna.tsinghua.edu.cn-df7c3c540f42cdbd\mongodb-2.1.0\src\client\options\mod.rs:1100:69: 1100:90], impl futures_util::Future<Output = std::result::Result<SrvResolver, error::Error>>, (), SrvResolver, &Vec<client::options::ServerAddress>, Vec<client::options::ServerAddress>, usize, &client::options::ServerAddress, client::options::ServerAddress, &str, impl futures_util::Future<Output = std::result::Result<ResolvedConfig, error::Error>>} and [impl AsRef<str>, std::option::Option<client::options::resolver_config::ResolverConfig>]
        --> C:\Users\BORBER\.cargo\registry\src\mirrors.tuna.tsinghua.edu.cn-df7c3c540f42cdbd\mongodb-2.1.0\src\client\options\mod.rs:1092:23
         |
    1092 |       ) -> Result<Self> {
         |  _______________________^
    1093 | |         let parser = ClientOptionsParser::parse(uri.as_ref())?;
    1094 | |         let srv = parser.srv;
    1095 | |         let auth_source_present = parser.auth_source.is_some();
    ...  |
    1145 | |         Ok(options)
    1146 | |     }
         | |_____^

    thread 'rustc' panicked at 'Box<dyn Any>', /rustc/17d29dcdce9b9e838635eb0adefd9b8b1588410b\compiler\rustc_errors\src\lib.rs:1115:9
    stack backtrace:
    note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

    note: the compiler unexpectedly panicked. this is a bug.
    note: we would appreciate a bug report: https://github.com/rust-lang/rust/issues/new?labels=C-bug%2C+I-ICE%2C+T-compiler&template=ice.md
    note: rustc 1.60.0-nightly (17d29dcdc 2022-01-21) running on x86_64-pc-windows-msvc
    note: compiler flags: -C embed-bitcode=no -C debuginfo=2 --crate-type lib
    note: some of the compiler flags provided by cargo are hidden

    query stack during panic:
    #0 [optimized_mir] optimizing MIR for `client::options::<impl at C:\Users\BORBER\.cargo\registry\src\mirrors.tuna.tsinghua.edu.cn-df7c3c540f42cdbd\mongodb-2.1.0\src\client\options\mod.rs:973:1: 1261:2>::parse_uri::{closure#0}`
    #1 [layout_of] computing layout of `[static generator#C:\Users\BORBER\.cargo\registry\src\mirrors.tuna.tsinghua.edu.cn-df7c3c540f42cdbd\mongodb-2.1.0\src\client\options\mod.rs:1092:23: 1146:6]`
    #2 [layout_of] computing layout of `core::future::from_generator::GenFuture<[static generator#C:\Users\BORBER\.cargo\registry\src\mirrors.tuna.tsinghua.edu.cn-df7c3c540f42cdbd\mongodb-2.1.0\src\client\options\mod.rs:1092:23: 1146:6]>`
    #3 [layout_of] computing layout of `impl core::future::future::Future<Output = [async output]>`
    #4 [optimized_mir] optimizing MIR for `client::options::<impl at C:\Users\BORBER\.cargo\registry\src\mirrors.tuna.tsinghua.edu.cn-df7c3c540f42cdbd\mongodb-2.1.0\src\client\options\mod.rs:973:1: 1261:2>::parse_uri`
    end of query stack
    error: aborting due to previous error
    error: could not compile `mongodb` due to previous error
Just use stable! With the stable toolchain it builds and runs:
        Finished dev [unoptimized + debuginfo] target(s) in 0.61s
         Running `target\debug\my-mongodb.exe`
    Hello, world!
Wazuh child decoder not parsing field correctly
I am trying to parse the log shown below with a child decoder in Wazuh 4.x, but for some reason it is not parsing the field I need.
Log entry:
    ossec: output: 'domainjoin-cli query|grep -i Domain': Domain = mydomain.local
Child decoder:
    <decoder name="ossec-domain">
      <parent>ossec</parent>
      <type>ossec</type>
      <prematch>^ossec: output:</prematch>
      <regex type="pcre2">^'domainjoin-cli[ \t]query|grep[ \t]-i[ \t]Domain':[ \t]Domain[ \t]=[ \t](\S+)</regex>
      <order>domain</order>
    </decoder>
Output:
    ossec: output: 'domainjoin-cli query|grep -i Domain': Domain = mydomain.local
    **Phase 1: Completed pre-decoding.
        full event: 'ossec: output: 'domainjoin-cli query|grep -i Domain': Domain = mydomain.local'
    **Phase 2: Completed decoding.
        name: 'ossec'
        parent: 'ossec'
    **Phase 3: Completed filtering (rules).
        id: '100008'
        level: '3'
        description: 'Server is in domain '
        groups: '['ossec']'
        firedtimes: '1'
        hipaa: '['164.312.b']'
        mail: 'False'
        pci_dss: '['10.6.1']'
    **Alert to be generated.
Taking into account the parent decoder:
    <decoder name="ossec">
      <prematch>^ossec: </prematch>
      <type>ossec</type>
    </decoder>
First of all, you should delete the prematch tag from the child decoder, since the parent already has a prematch regex. If you want to keep the prematch, you can use the offset attribute to indicate that the string output: comes after ossec: .
    <decoder name="ossec-domain">
      <parent>ossec</parent>
      <type>ossec</type>
      <prematch offset="after_parent">^output:</prematch>
      <regex type="pcre2">^'domainjoin-cli[ \t]query|grep[ \t]-i[ \t]Domain':[ \t]Domain[ \t]=[ \t](\S+)</regex>
      <order>domain</order>
    </decoder>
After that, note that the regex is wrong because it starts with ^. The ^ character anchors the match at the beginning of the log, and in this case the string that follows it is not at the beginning of the log, so you have to remove that character from the regex. Also take into account that | is the OR operator, which means that either the expression on its left or the one on its right must match the log. In your use case it stands for the literal character, so you need to escape it so that it is not treated as an OR operator.
Taking these points into account, this is the decoder you should use:
    <decoder name="ossec-domain">
      <parent>ossec</parent>
      <type>ossec</type>
      <prematch offset="after_parent">^output:</prematch>
      <regex type="pcre2">'domainjoin-cli[ \t]query\|grep[ \t]-i[ \t]Domain':[ \t]Domain[ \t]=[ \t](\S+)</regex>
      <order>domain</order>
    </decoder>
Logtest output:
    ossec: output: 'domainjoin-cli query|grep -i Domain': Domain = mydomain.local
    **Phase 1: Completed pre-decoding.
        full event: 'ossec: output: 'domainjoin-cli query|grep -i Domain': Domain = mydomain.local'
    **Phase 2: Completed decoding.
        name: 'ossec'
        parent: 'ossec'
        domain: 'mydomain.local'
I hope this helps. If you run into more problems, please tell me the Wazuh version you are using and I will be glad to help.
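The effect of the unescaped | can be reproduced outside Wazuh. The sketch below uses Python's `re` module rather than PCRE2, on the assumption that alternation and `\|` behave the same way for this pattern; it is an illustration of the regex issue only, not of how Wazuh applies decoders.

```python
import re

log = "ossec: output: 'domainjoin-cli query|grep -i Domain': Domain = mydomain.local"

# Unescaped: "|" splits the pattern into two alternatives, and the capture
# group belongs only to the right-hand one.
unescaped = r"'domainjoin-cli[ \t]query|grep[ \t]-i[ \t]Domain':[ \t]Domain[ \t]=[ \t](\S+)"

# Escaped: "\|" matches the literal pipe character in the command.
escaped = r"'domainjoin-cli[ \t]query\|grep[ \t]-i[ \t]Domain':[ \t]Domain[ \t]=[ \t](\S+)"

m_unescaped = re.search(unescaped, log)
m_escaped = re.search(escaped, log)

# The left alternative matches first, so the group captures nothing.
print(m_unescaped.group(0), m_unescaped.group(1))
# With the pipe escaped, the whole pattern matches and the domain is captured.
print(m_escaped.group(1))
```

With the unescaped pattern a match is still found, which is why the decoder appears to work while the domain field stays empty.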
How to print Arabic letters with thermal printer using Flutter esc_pos_printer library?
I'm using an EPSON TM-m30 printer, with these packages:
    esc_pos_printer: ^4.0.3
    esc_pos_utils: ^1.0.0
When I run this code:
    printDemoReceipt(NetworkPrinter printer) async {
      printer.text('ا ب ت ث ج ح خ د ذ ر ز س ش ص ض ف ق ك ل م ن ه و ي');
      printer.feed(2);
      printer.cut();
      printer.disconnect();
    }
it causes this error:
    [ERROR:flutter/lib/ui/ui_dart_state.cc(157)] Unhandled Exception: Invalid argument (string): Contains invalid characters.: "ا ب ت ث ج ح خ د ذ ر ز س ش ص ض ف ق ك ل م ن ه و ي"
    _UnicodeSubsetEncoder.convert (dart:convert/ascii.dart:88:9)
Did anyone fix this issue? Thank you.
I actually have the same issue, and I believe that for now the only solution is to create a PDF, convert it to an image, and then print the image.
Try this:
    import 'dart:convert' show utf8;

    printDemoReceipt(NetworkPrinter printer) async {
      // Encode the text as UTF-8 bytes instead of passing a String.
      final arabicText = utf8.encode('ا ب ت ث ج ح خ د ذ ر ز س ش ص ض ف ق ك ل م ن ه و ي');
      printer.textEncoded(arabicText);
      printer.feed(2);
      printer.cut();
      printer.disconnect();
    }
You have to find the character table for your printer and send the corresponding commands to tell the printer that you are printing multi-byte characters such as Arabic. In my case I was using a Sunmi printer, and the only thing that worked was finding its character table in the documentation and sending those commands before the text. This is what I did, and it worked perfectly:
    import 'dart:convert';
    import 'dart:typed_data';

    const utf8Encoder = Utf8Encoder();
    final encodedStr = utf8Encoder.convert(invoice.description);
    // Prefix the UTF-8 text with the printer's multi-byte mode commands.
    bytes += generator.textEncoded(Uint8List.fromList([
      ...[0x1C, 0x26, 0x1C, 0x43, 0xFF],
      ...encodedStr
    ]));
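The byte-level idea in that answer is independent of Dart, and can be sketched in Python. The command prefix below is copied from the answer and is specific to that Sunmi printer's documentation; treating it as valid for any other model would be an assumption. The name `build_payload` is hypothetical.

```python
# Command prefix copied from the answer above (Sunmi-specific; an
# assumption for any other printer model).
MULTIBYTE_PREFIX = bytes([0x1C, 0x26, 0x1C, 0x43, 0xFF])

def build_payload(text: str) -> bytes:
    """Return the command prefix followed by the UTF-8 encoded text."""
    return MULTIBYTE_PREFIX + text.encode("utf-8")

sample = "ا ب ت"
payload = build_payload(sample)

# A single-byte encoder cannot represent Arabic letters, which mirrors
# the "Contains invalid characters" error in the question.
try:
    sample.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(payload.hex(" "))
print(ascii_ok)
```

Each Arabic letter becomes two bytes in UTF-8, so the printer has to be switched into a mode that interprets the stream as multi-byte text before the payload arrives.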
Why does POSIX in conjunction with Win32 throw a warning with floor()?
I wrote the relatively simple code below to pop up a message box that reminds me of daily tasks.
    #use Math::Round;
    use POSIX;
    use Win32;
    use strict;
    use warnings;

    my $basetime = 1484784000;

    #code with POSIX
    my $days = floor((time()-$basetime) / 86400);

    #code without POSIX
    #my $days = sprintf("%d", (time()-$basetime) / 86400);

    #code with Math::Round
    #my $days = Math::Round::nearest_floor(1, (time()-$basetime) / 86400);

    my $bigString = "We've been going for $days days.\n";
    Win32::MsgBox($bigString);
The code works, but it throws a warning. The other two versions of my $days also work, without any warning. Here is the warning the POSIX version shows:
    Constant subroutine main::NULL redefined at C:/Strawberry/perl/lib/Exporter.pm line 66.
     at C:\coding\perl\posix-win32.pl line 3.
    Prototype mismatch: sub main::NULL () vs none at C:/Strawberry/perl/lib/Exporter.pm line 66.
     at C:\coding\perl\posix-win32.pl line 3.
I don't think I have ever used POSIX together with Win32 before, and I can see they both export a function with the same name, NULL, but I don't know what to do about it. I like using both modules, and obviously I wouldn't want this cropping up in more complicated projects. What is causing the warning, and how can I avoid it simply?
You're correct: both POSIX and Win32 export NULL by default. POSIX is a poorly behaved module that exports far, far, far too much by default (list at the bottom). To account for this, import only the functions you need.
    use Win32;
    use POSIX qw(floor);
POSIX uses Exporter to accomplish this. See "How to Import" in the Exporter documentation for more detail about controlling what gets imported.
    $ perl -wle 'use POSIX; print join ", ", @POSIX::EXPORT'
isupper, isspace, fabs, F_GETLK, strncpy, EBADMSG, localeconv, SIGTRAP, ctermid, S_ISUID, fwrite, pow, strcoll, S_ISBLK, _POSIX_STREAM_MAX, EACCES, putc, FILENAME_MAX, tolower, sinh, EMLINK, ESOCKTNOSUPPORT, EDESTADDRREQ, DBL_MIN, fopen, TOSTOP, strncat, LINK_MAX, ENXIO, INLCR, TCION, NAME_MAX, EINPROGRESS, SIGILL, NDEBUG, VEOF, SEEK_END, ungetc, SEEK_CUR, STDOUT_FILENO, VEOL, ftell, UINT_MAX, ENOTEMPTY, DBL_EPSILON, INPCK, WIFSIGNALED, B134, remove, LC_TIME, SIGSEGV, _POSIX_PATH_MAX, F_RDLCK, SIG_BLOCK, VINTR, SA_NOCLDSTOP, PATH_MAX, isdigit, log10, O_RDWR, ENOTCONN, TMP_MAX, signal, F_SETLKW, qsort, O_TRUNC, _SC_TZNAME_MAX, _POSIX_NGROUPS_MAX, LC_COLLATE, _PC_NO_TRUNC, SCHAR_MAX, EHOSTUNREACH, fputs, ctime, fgetc, O_APPEND, _POSIX_ARG_MAX, EWOULDBLOCK, TCSAFLUSH, strstr, _exit, execle, malloc, DBL_MANT_DIG, _POSIX_SSIZE_MAX, puts, _SC_JOB_CONTROL, ttyname, B150, EMFILE, CS6, _POSIX_LINK_MAX, asin, mblen, _POSIX_PIPE_BUF, sigsuspend, B600, SIGPROF, L_ctermid, _SC_CLK_TCK, ceil, ECHILD, tmpfile, isprint, ECHOE, memset, ENOLINK, atexit, MAX_CANON, EADDRINUSE, sigprocmask, stderr, fscanf, modf, setpgid, tcgetpgrp, toupper, ENETRESET, B2400, raise, S_ISDIR, _SC_PAGESIZE, DBL_MAX_EXP, sysconf, EIDRM, F_SETFD, O_NOCTTY, EHOSTDOWN, FLT_MAX, CSTOPB, S_IRWXU, EPROTO, TCSANOW, S_IRWXO, setbuf, strchr, strerror, FLT_MIN_EXP, TCIOFF, tan, SIGCONT, EDQUOT, MB_CUR_MAX, _PC_PATH_MAX, SIGTTOU, SIGXCPU, EROFS, fdopen, _PC_VDISABLE, CHILD_MAX, ETXTBSY, S_ISCHR, SIGTTIN, VERASE, ESRCH, LONG_MAX, mbtowc, pause, sscanf, MB_LEN_MAX, O_WRONLY, fstat, _PC_MAX_INPUT,
F_SETLK, SIGHUP, S_IXUSR, ETIME, DBL_MAX_10_EXP, execvp, ENOTSOCK, DBL_MIN_10_EXP, TCSADRAIN, isalnum, getchar, EMSGSIZE, TCIOFLUSH, _SC_NGROUPS_MAX, FLT_RADIX, ENOTDIR, _PC_LINK_MAX, strspn, S_IRWXG, _POSIX_NO_TRUNC, EXIT_SUCCESS, VKILL, acos, ERESTART, vprintf, EPFNOSUPPORT, IGNCR, _PC_MAX_CANON, STDIN_FILENO, strxfrm, _SC_VERSION, isxdigit, setsid, _POSIX_NAME_MAX, fmod, VSTART, B9600, FLT_MANT_DIG, islower, EXIT_FAILURE, clock, ENETDOWN, CS7, strrchr, SIGUSR2, tcdrain, INT_MIN, LDBL_DIG, _POSIX_JOB_CONTROL, SIG_UNBLOCK, _SC_STREAM_MAX, X_OK, F_UNLCK, ETIMEDOUT, CHAR_BIT, tmpnam, W_OK, sigpending, cfgetospeed, IEXTEN, geteuid, SIGRTMAX, E2BIG, LDBL_MIN, _SC_CHILD_MAX, CLK_TCK, NCCS, tzset, ENOMEM, gets, BRKINT, EDOM, ENODATA, ENOBUFS, ISTRIP, CLOCKS_PER_SEC, LDBL_MIN_EXP, SHRT_MIN, PARODD, EOF, asctime, ENFILE, EPROCLIM, freopen, sigaction, F_DUPFD, O_ACCMODE, FLT_MAX_10_EXP, difftime, TCOFLUSH, EINTR, ENOMSG, L_cuserid, B4800, EAGAIN, TCOON, setjmp, TZNAME_MAX, S_IWOTH, cuserid, PIPE_BUF, strtol, HUGE_VAL, F_GETFD, IGNPAR, EBUSY, memmove, ENOTBLK, getgid, SIGINT, EUSERS, SIGURG, EDEADLK, EOWNERDEAD, creat, _POSIX_MAX_CANON, _POSIX_CHOWN_RESTRICTED, execlp, F_SETFL, stdout, SIG_DFL, ldiv, SIGKILL, VSUSP, ENOTRECOVERABLE, B300, B200, HUPCL, WTERMSIG, offsetof, clearerr, tanh, getcwd, LDBL_MAX_10_EXP, SIG_SETMASK, ECHONL, O_NONBLOCK, S_IXOTH, ECONNABORTED, F_OK, tcflush, _POSIX_SAVED_IDS, SIGPIPE, _PC_NAME_MAX, ECANCELED, SIGCHLD, EREMOTE, FLT_MAX_EXP, SEEK_SET, getpid, B1800, NOFLSH, SIGUSR1, ECONNRESET, wcstombs, ESPIPE, WSTOPSIG, rewind, BUFSIZ, SIGABRT, STREAM_MAX, vsprintf, tcsendbreak, LDBL_MIN_10_EXP, pathconf, S_IRGRP, _SC_SAVED_IDS, OPOST, execv, feof, O_EXCL, access, sigsetjmp, mktime, fread, B1200, LC_MESSAGES, EXDEV, S_IROTH, longjmp, SA_RESETHAND, LC_ALL, ENOSYS, calloc, B110, FLT_EPSILON, assert, VQUIT, B50, ICANON, IXON, ECONNREFUSED, strftime, _PC_PIPE_BUF, ERANGE, SA_ONSTACK, ispunct, _POSIX_MAX_INPUT, WIFSTOPPED, ldexp, ENOLCK, EOTHER, 
_PC_CHOWN_RESTRICTED, PARENB, O_CREAT, STDERR_FILENO, ARG_MAX, ETOOMANYREFS, isatty, S_ISFIFO, SIGQUIT, abort, EPIPE, isalpha, USHRT_MAX, SA_RESTART, bsearch, IGNBRK, stdin, EPROTONOSUPPORT, ENOSPC, fgets, getegid, EAFNOSUPPORT, setvbuf, SIGTSTP, getuid, ESHUTDOWN, LONG_MIN, fgetpos, _POSIX_VERSION, frexp, %SIGRT, EADDRNOTAVAIL, F_WRLCK, lseek, EISDIR, atol, cfsetospeed, SIGALRM, fpathconf, B38400, L_tmpname, _POSIX_OPEN_MAX, ESTALE, LC_CTYPE, S_ISREG, WIFEXITED, EPROTOTYPE, SIG_IGN, EIO, ENAMETOOLONG, EPERM, atoi, isgraph, ENOENT, errno, MAX_INPUT, setuid, _SC_OPEN_MAX, S_IRUSR, siglongjmp, getenv, CS8, EINVAL, NULL, ECHO, LDBL_EPSILON, SCHAR_MIN, ENETUNREACH, uname, DBL_MAX, ENOPROTOOPT, SIGSTOP, strtoul, SA_NODEFER, CREAD, SIGBUS, mbstowcs, EFBIG, cfsetispeed, ISIG, FLT_MIN, SA_NOCLDWAIT, fsync, LDBL_MAX_EXP, ENOTTY, VMIN, strtod, TCIFLUSH, SA_SIGINFO, fclose, strcspn, strpbrk, SIGTERM, ENOSTR, ULONG_MAX, LC_NUMERIC, scanf, getgroups, vfprintf, ENOSR, FLT_ROUNDS, EEXIST, S_IWGRP, ENOEXEC, SIGVTALRM, SIGPOLL, memcmp, atan, putchar, _POSIX_CHILD_MAX, fflush, fsetpos, WEXITSTATUS, atof, EFAULT, memchr, strcat, VSTOP, _POSIX_TZNAME_MAX, LDBL_MAX, strlen, setlocale, FLT_MIN_10_EXP, cosh, tcgetattr, realloc, div, CHAR_MAX, fprintf, UCHAR_MAX, execve, B75, ICRNL, strcpy, ECHOK, FD_CLOEXEC, cfgetispeed, iscntrl, strtok, SSIZE_MAX, SIGSYS, S_ISGID, strncmp, EISCONN, labs, CLOCAL, R_OK, memcpy, F_GETFL, VTIME, dup, EALREADY, fseek, strcmp, SIGXFSZ, dup2, wctomb, SHRT_MAX, SIGFPE, SIG_ERR, _SC_ARG_MAX, setgid, execl, RAND_MAX, CSIZE, tcflow, CS5, LC_MONETARY, TCOOFF, _POSIX_VDISABLE, PARMRK, perror, mkfifo, ENODEV, S_IXGRP, WNOHANG, ferror, WUNTRACED, floor, INT_MAX, EOPNOTSUPP, OPEN_MAX, LDBL_MANT_DIG, DBL_DIG, SIGRTMIN, CHAR_MIN, tzname, O_RDONLY, B0, tcsetattr, tcsetpgrp, ELOOP, EOVERFLOW, S_IWUSR, IXOFF, EILSEQ, DBL_MIN_EXP, ENOTSUP, EBADF, B19200, free, fputc, NGROUPS_MAX, FLT_DIG
Cucumber, Rspec: unicode symbols in output
I wonder if it is possible to make Cucumber print matching errors in Russian instead of this:
    Сценарий: Успешное добавление кгиги # features/books/add_book.feature:12
      Если я добавил книгу # features/step_definitions/books_steps.rb:3
      То я должен увидеть добавленную книгу # features/step_definitions/books_steps.rb:15
        expected there to be content "\320\235\320\260\320\267\320\262\320\260\320\275 \320\270\320\265 \320\272\320\275\320\270\320\263\320\270" in "\320\236\321\210\320\270\320 \261\320\272\320\260 502!\n...
Here "\320\235\320\260\320\267\320\262\320\260\320\275" is a Russian word. It may be a feature of RSpec. Any ideas would be great.
Adding $KCODE='u' to my features/support/env.rb helped a little:
    А должен увидеть сообщение о том, что пароль неверен
      expected there to be content "Неверный прол\321\214"
This solution applies only to Ruby 1.8.7; in 1.9.3, the magic comment # encoding: utf-8 works just fine.
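The backslash sequences in that output are the UTF-8 bytes of the Russian text written as octal escapes, which can be checked by decoding them. The snippet below is a small Python illustration of that decoding, not something Cucumber or RSpec does itself.

```python
# The escaped "word" from the question, as raw bytes (octal escapes).
escaped = b"\320\235\320\260\320\267\320\262\320\260\320\275"

# Each Cyrillic letter takes two bytes in UTF-8, so 12 bytes give 6 letters.
decoded = escaped.decode("utf-8")
print(len(escaped), decoded)

# The leftover escape in the partially fixed output is a single letter.
soft_sign = b"\321\214".decode("utf-8")
print(soft_sign)
```

So the text itself is intact; the failure message is showing a byte-level dump of the string instead of rendering it as characters.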