post Short precious code snippet

August 12th, 2009

Filed under: .NET — Kai @ 12:45 am

I was recently talking about LINQ features with a friend. Then I saw that somebody wanted to break out of a loop after 50 iterations. That's trivial, but even for that there's a precious solution using LINQ.

int processed = 0;
foreach(ListViewItem lvi in listView.Items)
{
   //do stuff
   ++processed;
   if (processed == 50) break;
}

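Using LINQ (this needs using System.Linq; the Cast call is required because ListView.Items is a non-generic collection):

foreach(ListViewItem lvi in listView.Items.Cast<ListViewItem>().Take(50))
{
    //do stuff
}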

Or, you’re right, the “old” style would be:

for(int i = 0; i < listView.Items.Count && i < 50; i++) // i < 50, not <= 50, to stop after 50 items
{
    ListViewItem lvi = listView.Items[i];
    //do stuff
}

post E-Mail obfuscation - a disputed question

August 11th, 2009

Filed under: General Programming, Internet, Security — Kai @ 5:15 pm

Many users and forum programs, in an attempt to make automatic e-mail address harvesting harder, conceal addresses via obfuscation: @ is replaced with “at” and . is replaced with “dot”, so

bill.gates@microsoft.com

now becomes

bill dot gates at microsoft dot com

I’m not an expert in regular expressions and I’m really curious - does such obfuscation really make automatic harvesting harder? Is it really much harder to automatically identify such obfuscated addresses?

For example, if every email address on a large community site is reversed in the markup and rendered properly with CSS, or token-replaced (@ becomes ‘at’), or any other predictable method, the harvesters will just write a thin adapter for your site.

Think of it this way: if it only takes you one line of code to “scramble” them sitewide, it will only take the harvester one line of code to “unscramble” them for your site. Roughly speaking.
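To make the “one line of code” point concrete, here is a minimal sketch of how a harvester could undo the “at”/“dot” replacement. The method name unscramble is made up for this illustration, and the pattern is deliberately naive (it would also mangle ordinary phrases such as “look at this”), so take it as a rough idea rather than a real harvester:

```ruby
# Hypothetical helper: reverses the simple " at " / " dot " obfuscation.
# Naive on purpose: it rewrites every " dot " and " at " it finds.
def unscramble(text)
  text.gsub(/\s+dot\s+/, ".").gsub(/\s+at\s+/, "@")
end

puts unscramble("bill dot gates at microsoft dot com")
# prints "bill.gates@microsoft.com"
```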

Which approach is the right one? Use more complex obfuscation, or think about new ways?

Obfuscation techniques fall into the same category as captchas. They are not reliable and tend to hurt regular users more than bots.

Javascript obfuscation seems to be praised, but is no silver bullet: it is not that hard today to automate a browser for email sniffing. If it can be displayed in a browser, it can be harvested. You could even imagine a bot that’s taking screenshots of a browser window and using OCR to extract addresses to beat your million-dollar-obfuscation-technique.

Depending on where and why you want to obfuscate emails, these techniques could be useful:

  • Restrict email visibility: you may hide emails on your website/forum to anonymous users, to new users (with little to no activity or posts to date) or even hide them completely and replace email contact between members with a built-in private messaging feature.
  • Use a dedicated spam-filtered email: you will get spammed, but it will be limited to this particular address. This is a good trade-off when you need to expose the email address to any user.
  • Use a contact form: while bots are pretty good at filling forms, it turns out that they are too good at filling forms. Hidden field techniques can filter most of the spam coming through your contact form.
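The hidden field idea from the last point can be sketched like this. The form contains an extra field that is hidden from human visitors via CSS; a bot that fills in every field it finds gives itself away. The field name "website" and the method name spam_submission? are assumptions made for this example, not part of any particular framework:

```ruby
# Hypothetical name of the hidden (honeypot) form field.
HONEYPOT_FIELD = "website"

# Returns true when the hidden field was filled in, which a human
# visitor would normally never do because they cannot see it.
def spam_submission?(params)
  !params.fetch(HONEYPOT_FIELD, "").to_s.strip.empty?
end

human = { "name" => "Alice", "message" => "Hi!", "website" => "" }
bot   = { "name" => "x", "message" => "buy now", "website" => "http://spam.example" }

puts spam_submission?(human) # prints false
puts spam_submission?(bot)   # prints true
```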

One common way of hiding email from bots and spammers is to create an image containing the email address. Facebook does this, for instance. Now, using images for email is inherently bad for accessibility, because screen readers will not be able to read it. But even otherwise, there are several free character recognition programs that do a pretty good job of decoding such email-images.

In any case, always keep in mind that whatever makes the email address difficult for spammers to identify makes it difficult for your users as well. The Wikipedia article on email obfuscation, or address munging, is worth a look.

The real question is whether the extra effort will be put in by harvesters and if the (major? minor?) barrier to the harvesters is worth the possible problems for your users.

Finally, this article is, like so many others, about fighting spam. In my opinion, spam has become such a problem and so many databases have been turned over that we’re beyond hiding our addresses. Instead, think about more efficient ways of classifying and blocking spam.

post Firefox and its market share

July 1st, 2009

Filed under: Internet — Kai @ 4:12 pm

By Wednesday noon, over 3.8 million users worldwide had downloaded the new version of Firefox from the Internet. Firefox 3.5 is more than twice as fast as the previous version and, above all, runs more stably. The software now makes better use of the performance available in modern computers with multi-core processors. According to the developers, the sources contain over 5000 new features, most of which will of course only be noticed at second glance.

Nevertheless, it’s amazing (or maybe it should make me sad) that Firefox’s market share is still only about 22 percent. By comparison, Internet Explorer’s market share is about 65 percent.

It seems to me that, besides the many companies whose hands are tied when it comes to using another browser because of a partnership or similar contract with Microsoft, there must be a great number of normal PC users who seem to like IE…

post LINQ to Amazon

July 1st, 2009

Filed under: .NET — Kai @ 3:55 pm

As you know, you can query lots of different sources with LINQ. It’s possible to query, project and filter data in arrays, enumerable classes, XML (XLINQ), relational databases, and third-party data sources. Last year I wrote, among other things, something about LINQ & XML.

After running a query, we get the results as a collection of in-memory objects that can be enumerated using a standard iterator construct such as C#’s foreach.

I found a fun provider called Linq to Amazon, which allows querying Amazon for books using Linq! It uses Linq’s extensibility to allow for language-integrated queries against a book catalog. The Linq query gets converted to REST URLs supported by Amazon’s web services. These services return XML. The results are converted from XML to .NET objects using Linq to XML.

For the moment, let’s look at the client code:

var query =
  from book in new Amazon.BookSearch()
  where
    book.Title.Contains("darkness and light") &&
    (book.Publisher == "Hitchcock") &&
    (book.Price <= 25) &&
    (book.Condition == BookCondition.New)
  select book;

I think this code speaks for itself! This is Linq to Amazon code. It expresses a query against Amazon, but does not execute it… The query will be executed when we start enumerating the results.
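This deferred execution is not specific to Linq to Amazon. As an analogy only, sketched in Ruby rather than C# and using Ruby's built-in Enumerator::Lazy instead of LINQ, the following shows a query that is merely described at first and only evaluated once we start enumerating it. The variable names are made up for the illustration:

```ruby
evaluated = [] # records which elements the filter has actually looked at

# Describe the query: nothing is evaluated at this point.
query = (1..Float::INFINITY).lazy
          .select { |n| evaluated << n; n.even? }
          .map { |n| n * n }

evaluated_before = evaluated.dup # still empty, the query has not run

# Enumerating the results is what triggers the evaluation.
first_three = query.first(3)

puts evaluated_before.inspect # prints []
puts first_three.inspect      # prints [4, 16, 36]
```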

The following piece of code converts from Linq to Amazon to Linq to Objects:

var sequence = query.ToSequence();

The var might remind you of the days of Flash, VB or JavaScript. But those times were baaad, very bad. In C#, var is still statically typed (the compiler infers the type), but nowadays we’re strongly typed or, even better, generic:

A good alternative is:

IEnumerable<SomeOtherClass> results = ...

The returned catalogue can now be grouped, filtered in more detail or just thrown away. Do whatever you like to do ;-)

There are many more providers, some more useful than others:

  • LINQ to SharePoint
  • LINQ to Amazon
  • LINQ to Active Directory (LDAP)
  • LINQ to NHibernate
  • LINQ to MySQL / Oracle / SQLite
  • LINQ to Flickr

Have fun ;)

post Determine OS with ruby

June 1st, 2009

Filed under: Ruby — Kai @ 6:30 pm

I’m developing a tiny application that should run on (almost) any operating system. For that, I sometimes have to switch on the platform (e.g. for console coloring).
The constant RUBY_PLATFORM helps with this, so I wrote a small module:

module OsHelper

  def is_linux?
    RUBY_PLATFORM.downcase.include?("linux")
  end

  def is_windows?
    # native Windows builds report "mswin" or "mingw" depending on the toolchain
    !!(RUBY_PLATFORM.downcase =~ /mswin|mingw/)
  end

  def is_mac?
    RUBY_PLATFORM.downcase.include?("darwin")
  end

end

Unfortunately, I quickly realized that this is more of a bad idea than a good one, because RUBY_PLATFORM returns 'java' when using JRuby, for example.

I found the sys-uname library, which gives much more information than that constant does. (gem install sys-uname)

Finally, you can use it like this:

begin # use Sys::Uname library if present
  require 'sys/uname'
  @@os_name = Sys::Uname.sysname
  @@architecture = Sys::Uname.machine
  @@os_version = Sys::Uname.release
rescue LoadError # a bare rescue would not catch LoadError; otherwise use shell (Unix-like systems only)
  @@os_name = `uname -s`.strip
  @@architecture = `uname -m`.strip # -m corresponds to Sys::Uname.machine
  @@os_version = `uname -r`.strip
end
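If installing a gem is not an option, another possibility is the host_os entry of RbConfig::CONFIG from the standard library, which as far as I know describes the underlying operating system even on JRuby. The following is only a sketch: the method name os_family and the returned symbols are made up for this example, and the patterns cover just the common cases:

```ruby
require 'rbconfig'

# Maps a host_os string to a rough OS family.
# Defaults to the host_os of the running Ruby.
def os_family(host_os = RbConfig::CONFIG['host_os'])
  case host_os.downcase
  when /darwin|mac os/      then :mac
  when /mswin|mingw|cygwin/ then :windows
  when /linux/              then :linux
  else                           :unknown
  end
end

puts os_family # depends on the machine this runs on
```

Passing the string in as a parameter keeps the mapping easy to check without having each operating system at hand.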

I hope it will help someone someday ;)

post Are you redundant or backed up?

February 2nd, 2009

Filed under: Security — Kai @ 11:21 pm

Quite a while ago my external HD crashed. I don’t think that in itself is of much value or interest to people reading my blog, but there is something I have to tell you. As it turns out, I didn’t lose any data, thanks to my backup strategy, which I improve from time to time.

I bet anyway some of you probably don’t have a backup solution for your machine. Don’t tell me I didn’t warn you when your HD crashes. And it will crash, it’s just a matter of time!

Some years ago, after a very time-consuming loss of data, I wanted to solve this new problem of backing up data in the best possible way, so I started researching all this stuff that I had never really paid attention to before. I started looking into external HDs, NAS boxes (because it would be cool to stream data to my home network in addition to providing storage for my PC), RAID, and everything in between. But it probably took a weekend of research into these things before I realized the simple yet very important distinction between data redundancy and data backup.

As you probably know, there are different RAID levels; besides improving read/write speed, most of them also provide some data safety.

Redundancy is something you get e.g. by having two or more HDs in a RAID1 or RAID5 configuration. If one HD fails, you can recover your data from the other HDs, either due to mirroring or due to having a parity disk that will allow your data to be recreated. However, this is not a backup!

If your whole PC is fried, or whatever it might be, your data is irreparably lost. Backup is something that should protect you from data loss even in the case of a severe hardware failure.
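As a sketch of how parity-based redundancy recreates the contents of a single failed disk (and why it does not help once the whole machine is gone), here is the XOR idea in a few lines. The disks are modelled as small byte arrays and the method name parity is made up; a real RAID5 spreads the parity across all disks, which this simplification ignores:

```ruby
# XOR the blocks byte by byte; the result is the parity block.
def parity(blocks)
  blocks.transpose.map { |bytes| bytes.reduce(:^) }
end

disk_a = [0x12, 0x34, 0x56]
disk_b = [0xAB, 0xCD, 0xEF]
parity_disk = parity([disk_a, disk_b])

# Suppose disk_b fails: XOR of the surviving disk and the parity
# block gives back exactly the lost bytes.
rebuilt_b = parity([disk_a, parity_disk])

puts rebuilt_b == disk_b # prints true
```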

My solution is very simple: rsync copies the data from one PC to another (via ssh). Very simple, but just as effective & safe.
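A minimal sketch of what such an rsync call over ssh could look like when wrapped in a small Ruby script. The paths, the user and the host name are placeholders, and the chosen options (-a for archive mode, -z for compression, -e ssh to select the transport) are my assumption, not necessarily the exact setup described here:

```ruby
# Builds the argument list for an rsync run over ssh.
# All values passed in below are placeholders for illustration.
def rsync_backup_command(source_dir, remote_host, remote_dir)
  ["rsync", "-az", "-e", "ssh", source_dir, "#{remote_host}:#{remote_dir}"]
end

command = rsync_backup_command("/home/user/data/", "user@backuphost", "/srv/backup/data/")
puts command.join(" ")

# To actually run the backup: system(*command)
```

Building the command as an array and passing it to system(*command) avoids shell quoting problems with unusual file names.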

When thinking about backup solutions I got this obviously overstated idea: if my backup files are in, let’s say, Hong Kong while I’m in Nuremberg, I’ll be able to recover my files even if Nuremberg falls into the ocean in an earthquake (assuming I survive the ordeal).
Okay, you’re right - I’m losing track of reality; of course that’s not necessary at all. Just a few years ago this would have been inconceivable for private use, but nowadays it’s not such a bad idea… ;-)

Nevertheless, I keep copying data to a machine that’s just a few arm’s lengths from my workstation.

Powered by WordPress, Content and Design by Kai Bellmann