Windows 2003 Server R2 DFS-R - A little help


fizzwheel
03-12-10, 08:49 PM
It'd be handy to get some thoughts from those in the know on here.

OK, first off, what I can't do:

Windows 2008 Server
Any other form of OS (Unix, Linux, etc.)
Any form of appliance, NetApp filer, etc.

I don't have the budget for any of this, so they're non-starters.

I have 200GB-ish of data that I would like to reorganise. It's mainly transient data that doesn't hang around; it doesn't need to be secure, and it isn't PCI-sensitive or confidential in nature.

I need to replicate the data between two separate physical datacentres.

I have one Windows Server 2003 R2 box set up and configured in each datacentre.

I want to move the data out of our existing file server structure because, to be frank, it's poorly structured and organised, and I want to start again with a clean sheet.

I also want to break the dependency on the hosting file server's name always being the same, and in future I'd like to be able to replicate to / host on a third node.

I've been doing some reading and it looks like DFS-R will do what I want; the data I have is within the Microsoft guidelines in terms of size.

I like the replication algorithm and the fact that I can limit the amount of bandwidth it uses to replicate.

What I'm trying to work out is whether any additional load is likely to be placed on our existing AD infrastructure, and if so how much, i.e. increased calls to Domain Controllers, I/O reads, etc.

I know that I will need to pay attention to the size of the staging area on each of the file servers.

My colleague thinks there is, but when asked to justify it or produce evidence, he can't.

We're already using DFS-R successfully in another environment for a similar setup, albeit with a much smaller amount of data.

Any ideas or links to stuff to read would be useful; in the meantime I'll keep googling.

I know I should ask this on a specific IT forum, but there's loads of knowledge on here and I'd appreciate a bit of help.

SoulKiss
04-12-10, 09:16 AM
It'd be handy to get some thoughts from those in the know on here.

OK, first off, what I can't do:

Any other form of OS (Unix, Linux, etc.)

And for that reason, I'm out.

Sorry Fizz, I just don't do/know Windows in that kind of environment, and I know you don't want to hear how much easier it would be to do otherwise.

fizzwheel
04-12-10, 10:38 AM
I'd quite happily look at other solutions, but the money's not there to do it. We have a large Solaris estate, so we could do something with that if one of the Unix administrators were up for it, which they are not. The mere suggestion of creating a Samba share to host a few spreadsheets brought down mighty vengeance upon me, let alone 200GB worth of files.

andrewsmith
04-12-10, 10:54 AM
That's what Samba is good at.

One way would be to create a separate RAID server, set up the drives' file structure, and apply the system architecture you want.

On loading, the server team should be able to give a rough idea of the present load and the expected peak load of the proposed setup. Depending on how the switch room is set up (number of switches and how many PCs and terminals on each), the switches may be under capacity unless rationalisation has been undertaken.

Is the third node there as a failsafe in case the main server fails?


For the record, Fizz, I ain't a networks expert, but I have basic experience of switching from a previous job.

fizzwheel
04-12-10, 12:15 PM
That's what Samba is good at.

I know, but I work with a team of people who aren't very good at thinking around problems or outside the box. I'd be happy to put it on Samba, but the Unix team won't be.

I'm not worried about the network or the loading. The data already exists on the servers I'm looking to use; I just want to move it into a different folder and change the way the users access it. So rather than

mapping a drive to \\server1\share1

they access it via \\domainname\dfsroot\targetname

Because that removes the server name from the equation, which means I can move the data around without disrupting the users. What I'm getting at is that the server hosting this might not always be the same one.
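To illustrate with made-up names, the logon script change is basically this (a sketch only; server1, share1, dfsroot and targetname are all placeholders):

    @echo off
    rem Old mapping - tied to a specific server name:
    rem net use S: \\server1\share1

    rem New mapping - point at the domain-based DFS namespace instead,
    rem so the hosting server can change without touching the clients
    net use S: /delete >nul 2>&1
    net use S: \\domainname\dfsroot\targetname /persistent:no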

I have two servers hosting the data now and am using some third-party software to replicate between the two. I want to ditch the third-party software because it's not particularly good, and use DFS-R to replicate between the two servers (it can do this). Then, at a later date, I want to be able to add a third node in a new datacentre, replicate to that, and remove the node in the datacentre I am currently trying to shut.

If all this is hidden behind a DFS target, the users shouldn't notice anything.

My problem is that the person who looks after AD is throwing a hissy fit about extra load on the domain controllers, because they are telling me it'll create extra reads or lookups within AD every time somebody looks up where the DFS target points.

Remember, I'm talking about using DFS-R, not FRS, and the data isn't hosted on the domain controllers themselves. Nor, as I understand it, will the domain controllers do the actual file replication; they'll just hold a pointer to the data, if I've understood it properly.

-Ralph-
04-12-10, 01:00 PM
My problem is that the person who looks after AD is throwing a hissy fit about extra load on the domain controllers, because they are telling me it'll create extra reads or lookups within AD every time somebody looks up where the DFS target points.

There would be additional load. I've no idea how much, but I've never seen DFS hammer a DC to the point of causing performance issues.

How many users are we talking about, and how often is this data accessed?

Your colleagues will have read something somewhere. The problem with the best-practice info that MS pumps out is that it all assumes your network is getting hammered and has to be tuned for best performance or you'll have problems, and it assumes that a 5 or 10% performance hit will be an issue for you. In reality, what is the current utilisation of your AD servers? 10-20%?

Have you got access to a virtual platform? Set up a PoC with a pair of file servers on VMs with DFS configured, switch over a copy of the production data over a weekend, and switch your login scripts to point at the DFS shares on the Monday morning. Everyone will come in, log on and connect their mapped drives, and you can log perfmon on the AD servers for the day while the data is being used. If it's fine, leave the PoC for a week; if it's still fine, leave it for a month, and then you will have proved it. If it's not fine, you can switch the login scripts back at any point. If you're confident that your WAN links have the bandwidth and stability for the replication traffic (which is the biggest pain point with DFS), then you can have the two PoC file servers on the same local network, as it's only the impact on the DCs that you are trying to prove.

Obviously, remember to perfmon AD before you start so you have something to compare your PoC data against; if your AD servers are already at 60%, you know you probably shouldn't do it without upgrading your AD infrastructure first.
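From memory (so check logman /? and your own counter list; the counter paths here are assumptions), capturing that baseline from a command prompt on a DC looks something like:

    rem Create a counter log sampling CPU and directory reads every 15 seconds
    logman create counter DCBaseline -c "\Processor(_Total)\% Processor Time" "\NTDS\DS Directory Reads/sec" -si 00:00:15 -o C:\perflogs\DCBaseline

    rem Start it before the PoC, stop it after a representative period
    logman start DCBaseline
    logman stop DCBaseline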

fizzwheel
04-12-10, 01:30 PM
AD - I'd need to check, but IIRC they aren't anywhere near being overloaded.

The WAN isn't a problem for replication, as it goes over DWDM (2 x 1Gb links) with low latency and nowhere near capacity. Some of the replication traffic already going over that link is the file-serving replication traffic I'm looking to move to DFS-R.

I do have VMware. My trouble is that the 200GB of data I want to shift is buried in a right old mess and mismatch of structure, which I want to unpick, so that's a little trickier with regard to building a PoC and then pointing the users at it. Also, the data isn't being managed, so most of it is dormant and needs archiving, which is another problem I need to sort.

I was thinking of setting up the DFS-R, migrating anything that's been touched within the last six months, and ditching the older data anyway, so the initial load is probably going to be small.
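Something like resource-kit robocopy should handle that first pass; the paths here are made up, and I'd test the /MAXAGE behaviour on a small folder first:

    rem Copy only files modified within the last ~180 days into the
    rem folder DFS-R will replicate; /E includes (empty) subfolders
    robocopy \\server1\share1 D:\DFSRoot\Data /E /MAXAGE:180 /R:1 /W:5 /LOG:C:\robocopy-migrate.log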

Bear in mind we already have DFS-R running on a different pair of servers, and that's running with no trouble at all: no network or AD problems since we deployed it.

I'm assuming it's just the reads from AD that put the load on, and that the amount of load on AD has nothing whatsoever to do with the amount of data that DFS-R is sitting in front of?

fizzwheel
04-12-10, 01:35 PM
Oh, -Ralph-, another question if you don't mind.

Am I reinventing the wheel here, or going off at a tangent? Is this the sort of setup that you have seen at other companies?

From the reading I have done I would say I'm probably not, but it would be useful to know whether this is "best practice" or not...

flymo
04-12-10, 06:06 PM
We operate a very large deployment of DFS replicas (around 1,500 servers); it's extremely reliable.

You didn't mention whether or not the hosting servers are themselves domain controllers; they don't necessarily need to be. Ideally not, so as to keep the DFS load separate, especially in large data environments like the ones we host.

There is very little, if any, additional load on AD itself. The same replication engine is used, but if these servers are not DCs then that's not an issue.

Calls to AD from clients are no more than would be necessary with simple domain-joined file servers; once the client has the relevant access token, that's that.

When deciding which replica to access, a client queries based on the AD site information and is referred to its nearest site in the site topology. These are simple DNS queries and introduce very little additional load in the scheme of things.

-Ralph-
04-12-10, 06:10 PM
I do have VMware. My trouble is that the 200GB of data I want to shift is buried in a right old mess and mismatch of structure, which I want to unpick, so that's a little trickier with regard to building a PoC and then pointing the users at it. Also, the data isn't being managed, so most of it is dormant and needs archiving, which is another problem I need to sort.

I'm assuming it's just the reads from AD that put the load on, and that the amount of load on AD has nothing whatsoever to do with the amount of data that DFS-R is sitting in front of?

For a PoC I wouldn't bother unpicking the data; copy it as it is onto 200GB of free space. You don't want to anyway, because you want the PoC to be fairly invisible to the users: they want to hit the same drive letter and find the data in the same place. It also means you are testing the worst-case scenario, so if you have no performance problems with the data in a mess, you definitely won't have any once you've cleaned it up.

Yes, it is just the additional reads on AD from the DFS clients. What the impact is depends on whether you are spanning more than one domain. If you are just sites within a single domain, when a user tries to access data it will do a site discovery against AD to work out which DFS server to use, based on your policies. You can script a DFSUTIL command into your logon script (can't remember the switches, sorry, I don't do a hands-on technical job any more) which will restrict the DFS client to using a particular server, and it won't do this site discovery any more. You do lose the redundancy benefits of DFS though, because if that server goes down, the DFS client won't fail over to another one.

If you are spanning multiple domains, then DFS clients will do a discovery of the other domains every 15 minutes; again, this can be disabled through DFSUTIL. This discovery is against a DC in the domain the client is joined to, which hence holds the client's computer account.
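The switches I do remember are the referral-cache ones, which are handy when you're testing (going from memory here, so verify against dfsutil /? from the 2003 support tools):

    rem Show the client's DFS referral cache - which target each
    rem namespace path is currently being referred to
    dfsutil /pktinfo

    rem Flush the cache so the next access triggers a fresh referral
    dfsutil /pktflush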

Oh, -Ralph-, another question if you don't mind.

Am I reinventing the wheel here, or going off at a tangent? Is this the sort of setup that you have seen at other companies?

From the reading I have done I would say I'm probably not, but it would be useful to know whether this is "best practice" or not...

Reinventing the wheel? No, it doesn't sound like it. If you were doing this purely for DR there are other ways, and if you already have storage virtualisation (NetApp, FalconStor, DataCore, 3PAR, etc.) there are cleverer ways. But if what you want to achieve is improved user performance because users are not transiting WANs to fetch data, improved availability by giving users more than one data source, and distributed data managed from a central point with a main copy for housekeeping, backup, virus scanning, etc., then DFS is the right way of doing it, and it's free! We use it all the time on hosted solutions where the customer wants a local file server but wants the data centralised, managed and backed up in the datacentre.

Quiff Wichard
04-12-10, 06:13 PM
Have you tried turning it off and on again?

flymo
04-12-10, 06:21 PM
Fizz, the good thing about DFS namespaces is that you can do your reorganising without initially moving any data. The namespace and its associated folders can be pointed at specific paths, initially on your existing file server(s) if necessary.

Then, by adding additional hosts for particular parts of the folder structure, you can move things around. Once you have shifted users over to the DFS namespace, they will be unaware of where the data actually is and what the underlying folder structure looks like. That's partly the point of DFS namespaces: you get to abstract the namespace from the real server and folder names.
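As a sketch with placeholder names (dfscmd ships with 2003; the DFS management snap-in does the same job graphically):

    rem Create a link in the namespace that points at the data
    rem where it lives today - nothing moves yet
    dfscmd /map \\domainname\dfsroot\projects \\server1\share1\projects "Projects data"

    rem Later, add a second target on the new server and let
    rem DFS-R keep the two copies in step
    dfscmd /add \\domainname\dfsroot\projects \\server2\share1\projects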

fizzwheel
04-12-10, 06:39 PM
Fizz, the good thing about DFS namespaces is that you can do your reorganising without initially moving any data. The namespace and its associated folders can be pointed at specific paths, initially on your existing file server(s) if necessary.

Then, by adding additional hosts for particular parts of the folder structure, you can move things around. Once you have shifted users over to the DFS namespace, they will be unaware of where the data actually is and what the underlying folder structure looks like. That's partly the point of DFS namespaces: you get to abstract the namespace from the real server and folder names.

That's precisely the reason I want to do this piece of work.

We have a single domain.
The servers the data sits on are not DCs.
Both servers sit in the same forest.

We already have half a dozen DFS targets in place, as I have been slowly introducing them. All I'm really doing is adding another one.

It would seem, then, that in our scenario the extra AD reads to support this would be small.

-Ralph-
04-12-10, 08:07 PM
Fizz, the good thing about DFS namespaces is that you can do your reorganising without initially moving any data. The namespace and its associated folders can be pointed at specific paths, initially on your existing file server(s) if necessary.

Then, by adding additional hosts for particular parts of the folder structure, you can move things around. Once you have shifted users over to the DFS namespace, they will be unaware of where the data actually is and what the underlying folder structure looks like. That's partly the point of DFS namespaces: you get to abstract the namespace from the real server and folder names.

Good point, actually: it means you could do your PoC without moving any data. Most people avoid doing it that way for production, because it tends to fall by the wayside and the original data never gets moved or cleaned up; most prefer a clean start. For a PoC, though, that doesn't matter. It saves a whole load of ballache moving your data and maintaining data integrity with the most up-to-date copy, which, without some form of replication already in place, would require scheduled downtime. You could just re-point the clients from an existing file share to the new DFS namespace. The impact on AD would be the same and it wouldn't affect the results of the PoC; the DFS clients would still be doing the same lookups.

-Ralph-
04-12-10, 08:11 PM
All this talk of PoC, of course, assumes that you would need to technically prove DFS to your AD administrators before they would accept it. If you can talk them round, you don't need it. You already know that DFS works, and it's well proven. Any bad reputation it has stems from Windows 2000 FRS and is no longer valid.

fizzwheel
04-12-10, 08:22 PM
All this talk of PoC, of course, assumes that you would need to technically prove DFS to your AD administrators before they would accept it.

We have it already and have been using DFS-R for a small amount of data for 18 months or so, with no problems, no failures and no performance impact on AD.

Yet on Friday it suddenly became a massive problem that I'd like to add another DFS target to our setup.

The problem here isn't a technical one, it's bl**dy office politics again. I'm just after some ammo for Monday. I also think my colleague isn't keen on it because we've been bitten in the bum by FRS before, and one is being confused with the other through lack of understanding.

The business have asked us to look at how we could reorganise the main file server to meet some audit requirements around data classification.

Rather than just creating another share or folder amongst our existing mess, I wanted to start again with a clean sheet.

It seems to make sense to me to use DFS; it'll solve some other issues along the way as well and make everybody's life a little easier.

-Ralph-
04-12-10, 08:56 PM
There's a good paper on the web about the improvements in DFS-R over FRS; I've used it to convince a customer it was the way to go after they had been stung by FRS. Google should find it.

TBH though, FRS carefully configured and with plenty of bandwidth worked; but when W2000 came out, a 256Kb leased line was about 10 grand a year.